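Abstract translation:
Assembling DNA fragments from the imperfect strings produced by a sequencer is classified as NP-hard when an exact answer is required. The problem matters in many fields because it bears on detecting similarities between animals, dangerous pests in crops, and so on. Algorithms and data structures created to address it include the Needleman-Wunsch algorithm, DeBruijn graphs, and greedy algorithms working on overlap graphs; each takes a different approach, with its own advantages and disadvantages to be discussed. This article first summarizes the research on existing solutions to the DNA assembly problem, then proposes an on-line solution to the same problem which, although it does not consider mutations, would be able to assemble a user-specified number of genes using only the necessary number of reads.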
---
English title:
Assembling sequences of DNA using an on-line algorithm based on DeBruijn graphs
---
Authors:
Juan Manuel Ciro Restrepo, Andrés Felipe Zapata Palacio and Mauricio Toro
---
Year of latest submission:
2017
---
Classification:
Top-level category: Computer Science
Subcategory: Data Structures and Algorithms
Description: Covers data structures and analysis of algorithms. Roughly includes material in ACM Subject Classes E.1, E.2, F.2.1, and F.2.2.
--
Top-level category: Quantitative Biology
Subcategory: Other Quantitative Biology
Description: Work in quantitative biology that does not fit into the other q-bio classifications
--
---
English abstract:
The problem of assembling DNA fragments starting from imperfect strings given by a sequencer, classified as NP-hard when trying to get perfect answers, is of great importance in several fields because of its relation to detecting similarities between animals, dangerous pests in crops, and so on. Some of the algorithms and data structures that have been created to solve this problem are the Needleman-Wunsch algorithm, DeBruijn graphs, and greedy algorithms working on overlap graphs; these approach the problem in different ways, each with certain advantages and disadvantages to be discussed. In this article we first present a summary of the research done on existing solutions to the DNA assembly problem, and then present an on-line solution to the same problem which, despite not considering mutations, would be able to use only the necessary number of reads to assemble a user-specified number of genes.
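
The abstract names DeBruijn graphs as one of the data structures used for assembly. As an illustration only, here is a minimal off-line sketch of that general idea in Python: reads are cut into k-mers, each distinct k-mer becomes an edge between its (k-1)-mer prefix and suffix, and an Eulerian path through the graph spells out a candidate sequence. This is not the authors' on-line algorithm. The function names (`build_de_bruijn`, `eulerian_path`, `assemble`), the choice of `k`, and the example reads are all assumptions made for this sketch, and it assumes error-free reads whose graph has a single Eulerian path.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some read."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in seen:  # keep one edge per distinct k-mer
                continue
            seen.add(kmer)
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes a non-empty graph with an Eulerian path."""
    in_deg = defaultdict(int)
    for successors in graph.values():
        for v in successors:
            in_deg[v] += 1
    nodes = set(graph) | set(in_deg)
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (n for n in nodes if len(graph.get(n, [])) - in_deg.get(n, 0) == 1),
        None,
    )
    if start is None:
        start = next(iter(graph))
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path of the DeBruijn graph."""
    path = eulerian_path(build_de_bruijn(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])
```

With the hypothetical reads `["ATGGCG", "GCGTGC", "GTGCA"]` and `k = 4`, `assemble` returns `"ATGGCGTGCA"`. Repeated (k-1)-mers or sequencing errors would make the path ambiguous or break it, which is part of why the exact problem is hard.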
---
PDF link:
https://arxiv.org/pdf/1705.05105