Using coevolution to improve protein subfamily classification
Franco L. Simonetti, M. Banchero, A. Berenstein, A. Chernomoretz, Cristina Marino Buslje
Published 2015 in BMC Bioinformatics
ABSTRACT
Background: The common approach to protein subfamily classification groups protein sequences according to their degree of similarity. However, no single sequence similarity threshold accurately groups sequences into isofunctional groups. Current subfamily classification methods use bottom-up clustering to construct a cluster hierarchy, then cut the hierarchy at the most appropriate locations to obtain a single partitioning. These methods usually integrate data such as protein sequence similarity, residue conservation within groups, and HMM profiles. Although this approach is straightforward, it usually predicts a large number of subfamilies with few members and limited biological meaning. The goal of this study is to identify subsets of functionally related sequences within a given superfamily. Since all proteins within a superfamily share a common ancestor, we hypothesize that functional diversity within superfamilies has arisen through a series of concerted changes that must have left an identifiable coevolutionary signal.
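The bottom-up clustering and hierarchy cutting described in the Background can be illustrated with a minimal sketch. It is not the authors' method: the toy aligned sequences, the use of fractional identity as the similarity measure, average linkage, and the function names `identity`, `average_linkage`, and `cut` are all assumptions chosen for illustration. The point is only that different thresholds yield different partitions of the same sequences.

```python
from itertools import combinations

def identity(a, b):
    """Fraction of identical positions between two equal-length aligned sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def average_linkage(seqs):
    """Bottom-up clustering; returns the merges as (distance, cluster_a, cluster_b)."""
    # Pairwise distance = 1 - fractional identity (an assumed, simplistic measure).
    dist = {frozenset((i, j)): 1.0 - identity(seqs[i], seqs[j])
            for i, j in combinations(range(len(seqs)), 2)}

    def d(a, b):
        # Average linkage: mean distance over all cross-cluster pairs.
        return sum(dist[frozenset((i, j))] for i in a for j in b) / (len(a) * len(b))

    clusters = [(i,) for i in range(len(seqs))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: d(*pair))
        merges.append((d(a, b), a, b))
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
    return merges

def cut(n, merges, threshold):
    """Apply every merge at or below the distance threshold; return the partition."""
    clusters = {(i,) for i in range(n)}
    for distance, a, b in merges:
        if distance <= threshold:
            clusters -= {a, b}
            clusters.add(tuple(sorted(a + b)))
    return sorted(clusters)

# Hypothetical aligned toy sequences, not taken from the paper.
seqs = ["ACDEFGHK", "ACDEFGHR", "ACDEYGHK", "WCDQFPHK", "WCDQFPHR"]
merges = average_linkage(seqs)

print(cut(len(seqs), merges, 0.15))  # [(0, 1), (2,), (3, 4)]
print(cut(len(seqs), merges, 0.30))  # [(0, 1, 2), (3, 4)]
```

Cutting the same hierarchy at 0.15 gives three groups, while 0.30 gives two, which is the threshold sensitivity the Background points to.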
PUBLICATION RECORD
- Publication year: 2015
- Venue: BMC Bioinformatics
- Publication date: 2015-04-30
- Fields of study: Biology, Computer Science
- Source metadata: Semantic Scholar