"Mutation Clusters from Cancer Exome"
---
Authors:
Zura Kakushadze and Willie Yu
---
Latest submission year:
2017
---
Abstract:
We apply our statistically deterministic machine learning/clustering algorithm *K-means (recently developed in https://ssrn.com/abstract=2908286) to 10,656 published exome samples for 32 cancer types. A majority of cancer types exhibit mutation clustering structure. Our results are in-sample stable. They are also out-of-sample stable when applied to 1,389 published genome samples across 14 cancer types. In contrast, we find in- and out-of-sample instabilities in cancer signatures extracted from exome samples via nonnegative matrix factorization (NMF), a computationally costly and non-deterministic method. Extracting stable mutation structures from exome data could have important implications for speed and cost, which are critical for early-stage cancer diagnostics such as novel blood-test methods currently in development.
---
Classification:
Top-level category: Quantitative Biology
Subcategory: Genomics
Category description: DNA sequencing and assembly; gene and motif finding; RNA editing and alternative splicing; genomic structure and processes (replication, transcription, methylation, etc.); mutational processes.
--
Top-level category: Quantitative Biology
Subcategory: Quantitative Methods
Category description: All experimental, numerical, statistical and mathematical contributions of value to biology.
--
Top-level category: Quantitative Finance
Subcategory: Statistical Finance
Category description: Statistical, econometric and econophysics analyses with applications to financial markets and economic data.
--
---