The increasing number of sequenced genomes has prompted the development of several automated orthology prediction methods. Tests to evaluate the accuracy of predictions and to explore biases caused by biological and technical factors are therefore required. We used 70 manually curated families to analyze the performance of five public methods in Metazoa. We analyzed the strengths and weaknesses of the methods and quantified the impact of biological and technical challenges. From the latter part of the analysis, genome annotation emerged as the single largest factor, affecting up to 30% of the performance. Generally, most methods did well in assigning orthologous groups, but they failed to assign the exact number of genes for half of the groups. The publicly available benchmark set (http://eggnog.embl.de/orthobench/) should facilitate the improvement of current orthology assignment protocols, which is of utmost importance for many fields of biology and should be tackled by a broad scientific community.
Orthology prediction methods: A quality assessment using curated protein families
Kalliopi Trachana,T. Larsson,Sean Powell,Wei-Hua Chen,T. Doerks,J. Muller,P. Bork
Published 2011 in Bioessays
PUBLICATION RECORD
- Publication year
2011
- Venue
Bioessays
- Publication date
2011-10-01
- Fields of study
Biology, Medicine, Computer Science
- Source metadata
Semantic Scholar, PubMed
CONCEPTS
- exact gene count
The precise number of genes that should be assigned to an orthologous group in the curated reference.
Aliases: gene number, number of genes
- genome annotation
The quality and completeness of genome gene models and annotations considered as a technical factor in the benchmark.
Aliases: annotation quality
- manually curated families
A set of 70 protein families manually curated for use as the evaluation reference.
Aliases: 70 curated families, curated protein families
- metazoa
The animal clade used as the taxonomic scope for evaluating the methods.
Aliases: metazoans
- orthobench
A publicly available benchmark set of curated families for assessing orthology prediction methods.
Aliases: http://eggnog.embl.de/orthobench/
- orthologous group
A cluster of genes inferred to descend from a common ancestral gene by orthology relationships.
Aliases: OG, ortholog group
- orthology assignment protocols
Procedures and workflows used to assign genes to orthologous groups.
Aliases: orthology protocols
- orthology prediction methods
Automated methods used to infer orthologous relationships among genes across species.
Aliases: orthology assignment methods, ortholog prediction methods
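The distinction between assigning a gene to the right orthologous group and recovering the exact gene count can be illustrated with a small set comparison. The sketch below is an assumption for illustration only, not the scoring scheme used in the paper: the names `GroupComparison` and `compare_group`, the best-overlap matching rule, and the gene identifiers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupComparison:
    missing: frozenset  # reference genes absent from the best-matching prediction
    extra: frozenset    # predicted genes that are not in the reference group
    exact: bool         # True only when prediction and reference are identical


def compare_group(reference, predicted_groups):
    """Compare one curated reference group with the predicted group that overlaps it most."""
    ref = frozenset(reference)
    # Assumption: the predicted group sharing the most genes with the
    # reference is taken as its counterpart (ties go to the first one).
    best = max(
        (frozenset(p) for p in predicted_groups),
        key=lambda p: len(p & ref),
        default=frozenset(),
    )
    missing = ref - best
    extra = best - ref
    return GroupComparison(missing, extra, not missing and not extra)


# Hypothetical gene identifiers, for illustration only.
reference_group = {"hsa_A", "mmu_A", "dme_A"}
predictions = [{"hsa_A", "mmu_A"}, {"dme_A", "dme_B"}]
result = compare_group(reference_group, predictions)
```

Here the first predicted group is matched to the reference but lacks one gene, so the group is found while the exact gene count is not.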