Illuminating structural proteins in viral “dark matter” with metaproteomics

Jennifer R. Brum, J. Ignacio-Espinoza, Eun-Hae Kim, G. Trubl, Robert M. Jones, S. Roux, N. Verberkmoes, Virginia I. Rich, Matthew B. Sullivan

Published 2016 in Proceedings of the National Academy of Sciences of the United States of America

ABSTRACT

Significance: Marine viruses are abundant and have substantial ecosystem impacts, yet their study is hampered by the dominance of unannotated viral genes. Here, we use metaproteomics and metagenomics to examine virion-associated proteins in marine viral communities, providing tentative functions for 677,000 viral genomic sequences and the majority of previously unknown virion-associated proteins in these samples. The five most abundant protein groups comprised 67% of the metaproteomes and were tentatively identified as capsid proteins of predominantly unknown viruses, all of which putatively contain a protein fold that may be the most abundant biological structure on Earth. This methodological approach is thus shown to be a powerful way to increase our knowledge of the most numerous biological entities on the planet.

Viruses are ecologically important, yet environmental virology is limited by dominance of unannotated genomic sequences representing taxonomic and functional “viral dark matter.” Although recent analytical advances are rapidly improving taxonomic annotations, identifying functional dark matter remains problematic. Here, we apply paired metaproteomics and dsDNA-targeted metagenomics to identify 1,875 virion-associated proteins from the ocean. Over one-half of these proteins were newly functionally annotated and represent abundant and widespread viral metagenome-derived protein clusters (PCs). One primarily unannotated PC dominated the dataset, but structural modeling and genomic context identified this PC as a previously unidentified capsid protein from multiple uncultivated tailed virus families. Furthermore, four of the five most abundant PCs in the metaproteome represent capsid proteins containing the HK97-like protein fold previously found in many viruses that infect all three domains of life. The dominance of these proteins within our dataset, as well as their global distribution throughout the world’s oceans and seas, supports prior hypotheses that this HK97-like protein fold is the most abundant biological structure on Earth. Together, these culture-independent analyses improve virion-associated protein annotations, facilitate the investigation of proteins within natural viral communities, and offer a high-throughput means of illuminating functional viral dark matter.
