Alignment-Free Sequence Analysis and Applications.
Jie Ren, Xin Bai, Yang Young Lu, Kujin Tang, Ying Wang, G. Reinert, Fengzhu Sun
Published 2018 in Annual Review of Biomedical Data Science
ABSTRACT
Genome and metagenome comparisons based on large amounts of next-generation sequencing (NGS) data pose significant challenges for alignment-based approaches because of the huge data size and the relatively short length of the reads. Alignment-free approaches based on the counts of word patterns in NGS data do not depend on the complete genome and are generally computationally efficient. Thus, they contribute significantly to genome and metagenome comparison. Recently, novel statistical approaches have been developed for the comparison of both long and shotgun sequences. These approaches have been applied to many problems, including the comparison of gene regulatory regions, genome sequences, and metagenomes; the binning of contigs in metagenomic data; the identification of virus-host interactions; and the detection of horizontal gene transfers. We provide an updated review of these applications and other related developments of word-count-based approaches for alignment-free sequence analysis.
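The word-count idea in the abstract can be illustrated with a minimal Python sketch. It assumes DNA input over the alphabet ACGT, counts overlapping words (k-mers) in either a long sequence or a pool of short reads, and compares two count profiles with a cosine dissimilarity. The function names, the choice of cosine dissimilarity, and the example sequences are assumptions made for illustration; they are not taken from the paper, and the sketch omits refinements such as background-model centering and reverse-complement counting.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count overlapping words of length k in one sequence.

    Words containing characters outside ACGT (for example N) are skipped.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def pooled_counts(reads, k):
    """Sum k-mer counts over a collection of short reads (no assembly needed)."""
    total = Counter()
    for read in reads:
        total.update(kmer_counts(read, k))
    return total


def inner_product(x, y):
    """Sum over words of count_x(word) * count_y(word)."""
    return sum(c * y[w] for w, c in x.items())


def cosine_dissimilarity(x, y):
    """Return 1 - cosine similarity of two count profiles (0 = same direction)."""
    norm = sqrt(inner_product(x, x) * inner_product(y, y))
    if norm == 0:
        return 1.0  # at least one profile is empty; treat as maximally dissimilar
    return 1.0 - inner_product(x, y) / norm


if __name__ == "__main__":
    # Hypothetical toy inputs: one long sequence and one pool of reads.
    genome = kmer_counts("ACGTACGTTGCA", 3)
    sample = pooled_counts(["ACGTAC", "GTTGCA"], 3)
    print(cosine_dissimilarity(genome, sample))
```

Because the comparison only needs word counts, the same functions apply to an assembled genome and to unassembled shotgun reads, which is the property the abstract highlights.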
PUBLICATION RECORD
- Publication date
2018-03-26
- Fields of study
Biology, Computer Science, Mathematics, Environmental Science, Medicine
- Source metadata
Semantic Scholar, PubMed