FastUniq: A Fast De Novo Duplicates Removal Tool for Paired Short Reads
Hai-bin Xu, Xiang Luo, J. Qian, X. Pang, Jingyuan Song, Guangrui Qian, Jinhui Chen, Shilin Chen
Published 2012 in PLoS ONE
ABSTRACT
Duplicates introduced by PCR amplification are a major issue in paired short reads from next-generation sequencing platforms. These duplicates might have a serious impact on research applications, such as scaffolding in whole-genome sequencing and discovering large-scale genome variations, and are usually removed. We present FastUniq, a fast de novo tool for removing duplicates in paired short reads. FastUniq identifies duplicates by comparing sequences between read pairs and does not require a complete genome sequence as a prerequisite. FastUniq can handle reads of different lengths simultaneously, and its running time increases linearly with the number of reads, at an average speed of 87 million reads per 10 minutes. FastUniq is freely available at http://sourceforge.net/projects/fastuniq/.
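The abstract's core idea, identifying duplicates by comparing sequences between read pairs with no reference genome, can be illustrated with a minimal sketch. This is not FastUniq's implementation: it swaps in a plain hash set of exact sequence pairs, and it does not model how the tool treats reads of different lengths. The names `ReadPair` and `dedup_pairs` and the sample reads are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical representation: (mate 1 sequence, mate 2 sequence).
ReadPair = Tuple[str, str]


def dedup_pairs(pairs: Iterable[ReadPair]) -> Iterator[ReadPair]:
    """Yield the first occurrence of each distinct read pair, in input order.

    Assumption: two pairs are duplicates only when BOTH mates match exactly
    (case-insensitively). No alignment to a genome is involved.
    """
    seen = set()
    for r1, r2 in pairs:
        key = (r1.upper(), r2.upper())
        if key not in seen:
            seen.add(key)
            yield (r1, r2)


# Made-up example reads, not data from the paper.
pairs = [
    ("ACGT", "TTGA"),
    ("ACGT", "TTGA"),  # exact duplicate of the first pair: dropped
    ("ACGT", "TTGC"),  # same mate 1 but different mate 2: kept
    ("GGCA", "TTGA"),
]
unique_pairs = list(dedup_pairs(pairs))
```

Keying on both mates together is what makes this a paired-read comparison: two fragments that share one mate but differ in the other are kept as distinct. A set of whole sequences is memory-hungry at the scale of tens of millions of reads, so a sketch like this is only meant to convey the idea.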
PUBLICATION RECORD
- Publication year
2012
- Venue
PLoS ONE
- Publication date
2012-12-20
- Fields of study
Biology, Medicine, Computer Science
- Source metadata
Semantic Scholar, PubMed