Chimeric reads can be generated by in vitro recombination during the preparation of high-throughput sequencing libraries. Our attempt to detect biological recombination between the genomes of dengue virus (DENV; +ssRNA genome) and its mosquito host using the Illumina Nextera sequencing library preparation kit revealed that most, if not all, detected host–virus chimeras were artificial. Indeed, these chimeras were not more frequent than with control RNA from another species (a pillbug), which was never in contact with DENV RNA prior to the library preparation. The proportions of chimera types merely reflected those of the three species among sequencing reads. Chimeras were frequently characterized by the presence of 1-20 bp microhomology between recombining fragments. Within-species chimeras mostly involved fragments in opposite orientations and located less than 100 bp from each other in the parental genome. We found similar features in published datasets using two other viruses: Ebola virus (EBOV; -ssRNA genome) and a herpesvirus (dsDNA genome), both produced with the Illumina Nextera protocol. These canonical features suggest that artificial chimeras are generated by intra-molecular template switching of the DNA polymerase during the PCR step of the Nextera protocol. Finally, a published Illumina dataset using the Flock House virus (FHV; +ssRNA genome) generated with a protocol preventing artificial recombination revealed the presence of 1-10 bp microhomology motifs in FHV–FHV chimeras, but very few recombining fragments were in opposite orientations. Our analysis uncovered sequence features characterizing recombination breakpoints in short-read sequencing datasets, which can be helpful to evaluate the presence and extent of artificial recombination.
A Survey of Virus Recombination Uncovers Canonical Features of Artificial Chimeras Generated During Deep Sequencing Library Preparation
J. Peccoud, S. Lequime, Isabelle Moltini-Conclois, Isabelle Giraud, L. Lambrechts, C. Gilbert
Published 2018 in G3: Genes, Genomes, Genetics
PUBLICATION RECORD
- Publication date
2018-03-27
- Fields of study
Biology, Medicine, Environmental Science
- Source metadata
Semantic Scholar, PubMed