A flexible pipeline combining bioinformatic correction tools for prokaryotic and eukaryotic metabarcoding

Miriam I. Brandt, Blandine Trouche, Laure Quintric, P. Wincker, J. Poulain, S. Arnaud-Haond

Published 2019 in bioRxiv

ABSTRACT

Environmental metabarcoding is an increasingly popular tool for studying biodiversity in marine and terrestrial biomes. As metabarcoding with multiple markers spanning several branches of the tree of life becomes more accessible, bioinformatic pipelines need to accommodate both micro- and macrobiologists. We built and tested a pipeline based on Illumina read correction with DADA2 that allows the analysis of metabarcoding data from prokaryotic and eukaryotic life compartments. We implemented the option to cluster Amplicon Sequence Variants (ASVs) into Operational Taxonomic Units (OTUs) with swarm v2, a network-based clustering algorithm, and to further curate the ASVs/OTUs based on sequence similarity and co-occurrence rates using a recently developed algorithm, LULU. Finally, flexible taxonomic assignment of the ASVs was added, either via the RDP Bayesian classifier or by BLAST. We validated this pipeline with ribosomal and mitochondrial markers using eukaryotic mock communities and 42 deep-sea sediment samples. The comparison of BLAST and the RDP Classifier underlined the potential of the latter to deliver very good assignments, but highlighted the need for a concerted effort to build comprehensive yet specific databases adapted to the studied communities. The results underline the advantages of clustering and LULU curation for producing metazoan biodiversity inventories, and show that LULU is an effective tool for filtering metazoan molecular clusters while avoiding arbitrary relative abundance filters. Overall, conservative estimates of diversity can be obtained using the DADA2 and LULU correction algorithms alone (to obtain ASVs) or in combination with the clustering algorithm swarm v2 (to obtain OTUs), depending on the objective of the study.

PUBLICATION RECORD

  • Publication date

    2019-08-01

  • Fields of study

    Biology, Computer Science, Environmental Science

