Mutation Clusters from Cancer Exome

Publication type: Article, Other literature type, Preprint

Date: 01 Jan 2017 (embargo end date: 01 Jan 2017)

Language: English

Publisher: Elsevier BV

Journal: SSRN Electronic Journal (eISSN: 1556-5068)

Funded by: WT | unidentified

Authors: Kakushadze, Zura; Yu, Willie

doi: 10.2139/ssrn.2945010 , 10.3390/genes8080201 , 10.48550/arxiv.1707.08504

pmid: 28809811

pmc: PMC5575665

arXiv: 1707.08504

Abstract

We apply our statistically deterministic machine learning/clustering algorithm *K-means (recently developed in https://ssrn.com/abstract=2908286) to 10,656 published exome samples for 32 cancer types. A majority of cancer types exhibit a mutation clustering structure. Our results are in-sample stable. They are also out-of-sample stable when applied to 1,389 published genome samples across 14 cancer types. In contrast, we find in- and out-of-sample instabilities in cancer signatures extracted from exome samples via nonnegative matrix factorization (NMF), a computationally costly and non-deterministic method. Extracting stable mutation structures from exome data could have important implications for speed and cost, which are critical for early-stage cancer diagnostics, such as novel blood-test methods currently in development.
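As a rough illustration of why determinism matters for clustering, the sketch below runs plain (vanilla) k-means from several random initializations on a made-up count matrix and keeps the run with the lowest objective. It is not the *K-means algorithm from the cited paper, only a simple stand-in showing how initialization-dependent results can be reduced to a single answer. The function names `kmeans` and `best_of_runs`, the toy Poisson data, the matrix shape, and the choice of five clusters are all assumptions made for illustration.

```python
import numpy as np


def kmeans(X, k, seed, n_iter=100):
    """Plain Lloyd's k-means. Returns (labels, objective) for one random start."""
    rng = np.random.default_rng(seed)
    # Initialize centers at k distinct, randomly chosen rows of X.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center: shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster became empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and within-cluster sum of squares for the last centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    objective = d2[np.arange(len(X)), labels].sum()
    return labels, objective


def best_of_runs(X, k, seeds):
    """Run k-means once per seed and return the lowest-objective (labels, objective)."""
    runs = [kmeans(X, k, s) for s in seeds]
    return min(runs, key=lambda run: run[1])


# Hypothetical toy input: 96 rows (e.g. mutation categories) by 8 columns
# (e.g. cancer types), filled with log-transformed random counts.
data_rng = np.random.default_rng(0)
X = np.log1p(data_rng.poisson(5.0, size=(96, 8)).astype(float))

seeds = range(20)
objectives = [kmeans(X, 5, s)[1] for s in seeds]
best_labels, best_objective = best_of_runs(X, 5, seeds)

print("objective range across seeds:", min(objectives), max(objectives))
print("best objective:", best_objective)
```

Fixing the seed list makes the whole procedure reproducible, but the answer still depends on which seeds are chosen; removing that dependence is the kind of problem a statistically deterministic method is meant to address.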

Related Organizations

Free University of Tbilisi, Georgia
Duke NUS Graduate Medical School, Singapore
National University of Singapore, Singapore

Keywords

Genomics (q-bio.GN); Statistical Finance (q-fin.ST); clustering; K-means; nonnegative matrix factorization; somatic mutation; cancer signatures; genome; exome; DNA; eRank; correlation; covariance; machine learning; sample; matrix; source code; quantitative finance; statistical risk model; industry classification; Quantitative Finance - Statistical Finance; Quantitative Biology - Quantitative Methods; Article; FOS: Economics and business; FOS: Biological sciences; Quantitative Biology - Genomics; Quantitative Methods (q-bio.QM)

Impact by BIP!

selected citations: 1
popularity: Average
influence: Average
impulse: Average