Title:
Investigating some attributes of periodicity in DNA sequences via semi-Markov modelling
---
Authors:
Pavlos Kolias and Alexandra Papadopoulou
---
Year of latest submission:
2019
---
Classification:
Primary category: Statistics
Secondary category: Applications
Category description: Biology, Education, Epidemiology, Engineering, Environmental Sciences, Medical, Physical Sciences, Quality Control, Social Sciences
--
Primary category: Quantitative Biology
Secondary category: Other Quantitative Biology
Category description: Work in quantitative biology that does not fit into the other q-bio classifications
--
---
Abstract:
DNA segments and sequences have been studied thoroughly during the past decades. One of the main problems in computational biology is the identification of exon-intron structures inside genes using mathematical techniques. Previous studies have used different methods, such as Fourier analysis and hidden-Markov models, in order to be able to predict which parts of a gene correspond to a protein encoding area. In this paper, a semi-Markov model is applied to 3-base periodic sequences, which characterize the protein-coding regions of the gene. Analytic forms of the related probabilities and the corresponding indexes are provided, which yield a description of the underlying periodic pattern. Last, the previous theoretical results are illustrated with DNA sequences of synthetic and real data.
---
PDF link:
https://arxiv.org/pdf/1907.03119