{"corpus_id":135371,"paper_sha":"a46320f5516f855ad6b56e3bca584a70f5a954d4","doi":"10.1111/j.1365-294X.2011.05239.x","arxiv_id":null,"pmid":21883587,"pmcid":null,"mag_id":2950812077,"dblp_id":null,"acl_id":null,"title":"ABGD, Automatic Barcode Gap Discovery for primary species delimitation","year":2012,"publication_date":"2012-04-01","venue":"Molecular Ecology","journal":{"name":"Molecular Ecology","pages":null,"volume":"21"},"journal_issn":null,"journal_title":null,"publication_types":["JournalArticle","Study"],"pubmed_pub_types":["Evaluation Study","Journal Article","Research Support, Non-U.S. Gov't"],"s2_fields_of_study":["Biology","Medicine","Computer Science"],"reference_count":76,"citation_count":2789,"influential_citation_count":504,"is_open_access":false,"arxiv_categories":null,"arxiv_license":null,"arxiv_journal_ref":null,"mesh_headings":[{"d":"Automation","mj":false,"ui":"D001331"},{"d":"Base Sequence","mj":false,"ui":"D001483"},{"d":"Computational Biology","mj":false,"qs":[{"q":"methods","mj":true,"ui":"Q000379"}],"ui":"D019295"},{"d":"DNA","mj":false,"qs":[{"q":"analysis","mj":false,"ui":"Q000032"},{"q":"genetics","mj":false,"ui":"Q000235"}],"ui":"D004247"},{"d":"DNA Barcoding, Taxonomic","mj":false,"qs":[{"q":"methods","mj":true,"ui":"Q000379"}],"ui":"D058893"},{"d":"DNA, Mitochondrial","mj":false,"qs":[{"q":"genetics","mj":false,"ui":"Q000235"}],"ui":"D004272"},{"d":"Phylogeny","mj":false,"ui":"D010802"},{"d":"Sensitivity and Specificity","mj":false,"ui":"D012680"},{"d":"Sequence Alignment","mj":false,"ui":"D016415"},{"d":"Sequence Analysis, DNA","mj":false,"ui":"D017422"},{"d":"Species Specificity","mj":false,"ui":"D013045"}],"chemicals":[{"n":"DNA, Mitochondrial","ui":"D004272","reg":"0"},{"n":"DNA","ui":"D004247","reg":"9007-49-2"}],"comments_corrections":null,"source_flags":5,"s2_open_access_pdf_url":null,"s2_open_access_landing_url":null,"s2_open_access_license":null,"s2_open_access_status":null,"pmc_open_access_pdf_url":null,"pmc_open_access_landing_url":null,"pmc_open_access_license":null,"pmc_open_access_status":null,"unpaywall_open_access_pdf_url":null,"unpaywall_open_access_landing_url":null,"unpaywall_open_access_license":null,"unpaywall_open_access_status":null,"abstract":"Within uncharacterized groups, DNA barcodes, short DNA sequences that are present in a wide range of species, can be used to assign organisms into species. We propose an automatic procedure that sorts the sequences into hypothetical species based on the barcode gap, which can be observed whenever the divergence among organisms belonging to the same species is smaller than divergence among organisms from different species. We use a range of prior intraspecific divergence to infer from the data a model-based one-sided confidence limit for intraspecific divergence. The method, called Automatic Barcode Gap Discovery (ABGD), then detects the barcode gap as the first significant gap beyond this limit and uses it to partition the data. Inference of the limit and gap detection are then recursively applied to previously obtained groups to get finer partitions until there is no further partitioning. Using six published data sets of metazoans, we show that ABGD is computationally efficient and performs well for standard prior maximum intraspecific divergences (a few per cent of divergence for the five data sets), except for one data set where less than three sequences per species were sampled. We further explore the theoretical limitations of ABGD through simulation of explicit speciation and population genetics scenarios. Our results emphasize in particular the sensitivity of the method to the presence of recent speciation events, via (unrealistically) high rates of speciation or large numbers of species. In conclusion, ABGD is fast, simple method to split a sequence alignment data set into candidate species that should be complemented with other evidence in an integrative taxonomic approach.","claims":[{"public_id":"cl_8d918b45ff32f49c27e018b9618db95f","status":"active","text":"ABGD is a fast, simple method for splitting a sequence alignment data set into candidate species and is intended to be complemented by other evidence in integrative taxonomy.","confidence":0.9,"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/claims/cl_8d918b45ff32f49c27e018b9618db95f"},{"public_id":"cl_96f47053f539a551d820db8308e03e41","status":"active","text":"ABGD is computationally efficient and performs well on six published metazoan data sets for standard prior maximum intraspecific divergences, except for one data set with fewer than three sequences per species.","confidence":0.95,"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/claims/cl_96f47053f539a551d820db8308e03e41"},{"public_id":"cl_97561be0aaa982278580329e4ea32ac5","status":"active","text":"Automatic Barcode Gap Discovery partitions DNA barcode sequence data into candidate species by detecting the first significant gap beyond an inferred upper limit on intraspecific divergence.","confidence":0.98,"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/claims/cl_97561be0aaa982278580329e4ea32ac5"},{"public_id":"cl_46e93e1fcbd857a2bf5e2050ec939346","status":"active","text":"Recursive application of the divergence-limit inference and barcode-gap detection yields finer partitions until no further partitioning is possible.","confidence":0.91,"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/claims/cl_46e93e1fcbd857a2bf5e2050ec939346"},{"public_id":"cl_1fa53970bbd4e4240cf00107edfc9363","status":"active","text":"The method is sensitive to recent speciation events, especially under unrealistically high speciation rates or when many species are present.","confidence":0.93,"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/claims/cl_1fa53970bbd4e4240cf00107edfc9363"}],"concepts":[{"public_id":"co_0e31a1c9b029fd9347d00eb223ed4270","status":"active","name":"barcode gap","description":"The gap between within-species sequence divergence and between-species sequence divergence used to separate groups.","types":["phenomenon"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_0e31a1c9b029fd9347d00eb223ed4270"},{"public_id":"co_10e59cc9f47c63c39d3caddd2be52276","status":"active","name":"speciation rates","description":"Rates at which new species arise in a lineage.","types":["parameter"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_10e59cc9f47c63c39d3caddd2be52276"},{"public_id":"co_2780305f6171711de53671dc48011d28","status":"active","name":"intraspecific divergence","description":"Sequence divergence among organisms belonging to the same species.","types":["measure"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_2780305f6171711de53671dc48011d28"},{"public_id":"co_7448de7c9fb49a9263b404777fc56a1d","status":"active","name":"Automatic Barcode Gap Discovery","description":"An automatic procedure for delimiting candidate species from DNA barcode sequence data by locating barcode gaps.","types":["method"],"aliases":["ABGD"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_7448de7c9fb49a9263b404777fc56a1d"},{"public_id":"co_88b7d994eecfe52c3f2195236af00ae0","status":"active","name":"prior maximum intraspecific divergence","description":"An assumed upper bound on within-species sequence divergence used as input to the method.","types":["parameter"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_88b7d994eecfe52c3f2195236af00ae0"},{"public_id":"co_9713f6e7256029ef32857b7a25277738","status":"active","name":"integrative taxonomic approach","description":"A species-delimitation framework that combines multiple lines of evidence.","types":["approach"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_9713f6e7256029ef32857b7a25277738"},{"public_id":"co_9afa0d5009b470209b7d292739850977","status":"active","name":"recursive partitioning","description":"Repeated subdivision of already obtained groups into finer partitions.","types":["method"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_9afa0d5009b470209b7d292739850977"},{"public_id":"co_e74005e2ae51c32e54707c8542f946b4","status":"active","name":"sequence alignment data set","description":"An aligned collection of DNA barcode sequences used as input for species partitioning.","types":["dataset"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_e74005e2ae51c32e54707c8542f946b4"},{"public_id":"co_f2c80281eb81a7a9cb3772646d235155","status":"active","name":"recent speciation events","description":"Very recent splitting of lineages into separate species.","types":["evolutionary process"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_f2c80281eb81a7a9cb3772646d235155"},{"public_id":"co_f983bd2e8ef667e650e9011fb2c0d8ff","status":"active","name":"metazoan data sets","description":"Published sequence data sets from animals used to evaluate the method.","types":["dataset"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_f983bd2e8ef667e650e9011fb2c0d8ff"}],"external_ids":{"DOI":"10.1111/j.1365-294X.2011.05239.x","ArXiv":null,"PubMed":21883587,"PubMedCentral":null,"MAG":2950812077,"DBLP":null,"ACL":null},"open_access":{"is_open_access":false,"pdf_url":null,"landing_url":"https://sah.borca.ai/papers/135371","source":null,"pdf_url_source":null,"license":null,"reason":"pdf_url_not_indexed"},"reference_availability":{"status":"available","references_indexed":true,"full_text_available":false,"full_text_source":null,"count_basis":"semantic_scholar_metadata","extraction_status":"not_applicable","reason":null},"source":{"provider":"episteme2","base_corpus":"semantic_scholar_dump","freshness_mode":"unknown","basis":["semantic_scholar_metadata","postgres_metadata"],"limits":["paper metadata is based on indexed upstream scholarly datasets","claims and concepts are available only for extracted papers","absence of claims or concepts means no extracted graph data is available in this response"],"status":"available","degraded":false,"degraded_reasons":[],"diagnostics":{"status":"available","degraded":false,"degraded_reasons":[],"metadata_status":"available","graph_status":"available","abstract_status":"available"},"source_flags":5},"paper_id":631042,"paper_uid":"4a1383dc-e0dd-4d44-b241-e0c7b3ab2fe2","canonical_identity":{"paper_id":631042,"paper_uid":"4a1383dc-e0dd-4d44-b241-e0c7b3ab2fe2","identity_status":"available","lookup_basis":"semantic_scholar_external_id","compatibility_path":"corpus_id"},"url":"https://sah.borca.ai/papers/135371"}