Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega

Fabian Sievers, A. Wilm, David Dineen, T. Gibson, K. Karplus, Weizhong Li, R. Lopez, Hamish McWilliam, M. Remmert, J. Söding, J. Thompson, D. Higgins

Published 2011 in Molecular Systems Biology

ABSTRACT

Multiple sequence alignments are fundamental to many sequence analysis methods. Most alignments are computed using the progressive alignment heuristic. These methods are starting to become a bottleneck in some analysis pipelines when faced with data sets of many thousands of sequences. Some methods allow computation of larger data sets while sacrificing quality, and others produce high‐quality alignments but scale badly with the number of sequences. In this paper, we describe a new program called Clustal Omega, which can align virtually any number of protein sequences quickly and deliver accurate alignments. The accuracy of the package on smaller test cases is similar to that of the high‐quality aligners. On larger data sets, Clustal Omega outperforms other packages in terms of execution time and quality. Clustal Omega also has powerful features for adding sequences to existing alignments and for exploiting the information they contain, making use of the vast amount of precomputed information in public databases like Pfam.

PUBLICATION RECORD

Publication year: 2011
Venue: Molecular Systems Biology
Publication date: 2011-10-11
Fields of study: Biology, Medicine, Computer Science
Identifiers: DOI 10.1038/msb.2011.75; PMID 21988835; PMCID 3261699
Source metadata: Semantic Scholar, PubMed
