Summary Linked-read sequencing greatly improves haplotype assembly over standard paired-end analysis. The detection of mosaic single-nucleotide variants benefits from haplotype assembly when the model is informed by the mapping between constituent reads and linked reads. Samovar evaluates haplotype-discordant reads identified through linked-read sequencing, thus enabling phasing and mosaic variant detection across the entire genome. Samovar trains a random forest model to score candidate sites using a dataset that considers read quality, phasing, and linked-read characteristics. Samovar calls mosaic single-nucleotide variants (SNVs) within a single sample with accuracy comparable to what previously required trios or matched tumor/normal pairs, and it outperforms single-sample mosaic variant callers at minor allele frequencies of 5%–50% with at least 30X coverage. Samovar finds somatic variants in both tumor and normal whole-genome sequencing from 13 pediatric cancer cases that can be corroborated with high recall by whole-exome sequencing. Samovar is available as open source at https://github.com/cdarby/samovar under the MIT license.
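To make the idea of haplotype-discordant reads concrete, the following is a minimal sketch, not Samovar's actual code or feature definition. It assumes each read covering a candidate site has already been assigned to haplotype 1 or 2 and reduced to a hypothetical `(haplotype, allele)` pair. The function name `haplotype_discordance` is invented for illustration. A read counts as discordant when its allele differs from the majority allele on its own haplotype.

```python
from collections import Counter


def haplotype_discordance(reads):
    """Count reads whose allele disagrees with the majority allele of their haplotype.

    `reads` is an iterable of (haplotype, allele) pairs, with haplotype 1 or 2.
    This input layout is an assumption made for illustration only.
    Returns (number of discordant reads, discordant fraction of phased reads).
    """
    by_hap = {1: Counter(), 2: Counter()}
    for hap, allele in reads:
        if hap in by_hap:  # skip unphased reads
            by_hap[hap][allele] += 1

    total = sum(sum(counts.values()) for counts in by_hap.values())
    if total == 0:
        return 0, 0.0

    discordant = 0
    for counts in by_hap.values():
        if counts:
            majority = counts.most_common(1)[0][1]
            discordant += sum(counts.values()) - majority
    return discordant, discordant / total


# Germline heterozygous site: allele tracks haplotype perfectly.
germline = [(1, "A")] * 15 + [(2, "G")] * 15
# Mosaic-like site: a minority of haplotype-1 reads carry a different allele.
mosaic = [(1, "A")] * 12 + [(1, "G")] * 3 + [(2, "A")] * 15

print(haplotype_discordance(germline))
print(haplotype_discordance(mosaic))
```

In this toy setup the germline site yields no discordant reads, while the mosaic-like site yields three discordant reads out of thirty. A count of this kind could serve as one input feature, alongside read quality and linked-read characteristics, to a classifier such as the random forest the summary describes.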
Samovar: Single-Sample Mosaic Single-Nucleotide Variant Calling with Linked Reads
Charlotte A. Darby, James R. Fitch, P. Brennan, B. Kelly, Natalie Bir, V. Magrini, J. Leonard, C. Cottrell, J. Gastier-Foster, R. Wilson, E. Mardis, P. White, Ben Langmead, M. Schatz
Published 2019 in iScience
PUBLICATION RECORD
- Publication year
2019
- Venue
iScience
- Publication date
2019-05-29
- Fields of study
Biology, Medicine, Computer Science
- Source metadata
Semantic Scholar, PubMed