Detecting protein variants by mass spectrometry: a comprehensive study in cancer cell-lines

J. Alfaro, A. Ignatchenko, V. Ignatchenko, A. Sinha, P. Boutros, T. Kislinger

Published 2017 in Genome Medicine

ABSTRACT

Background: Onco-proteogenomics aims to understand how changes in a cancer’s genome influence its proteome. One challenge in integrating these molecular data is the identification of aberrant protein products from mass-spectrometry (MS) datasets, as traditional proteomic analyses only identify proteins from a reference sequence database.

Methods: We established proteomic workflows to detect peptide variants within MS datasets. We used a combination of publicly available population variants (dbSNP and UniProt) and somatic variations in cancer (COSMIC) along with sample-specific genomic and transcriptomic data to examine proteome variation within and across 59 cancer cell-lines.

Results: We developed a set of recommendations for the detection of variants using three search algorithms, a split target-decoy approach for FDR estimation, and multiple post-search filters. We examined 7.3 million unique variant tryptic peptides not found within any reference proteome and identified 4771 mutations corresponding to somatic and germline deviations from reference proteomes in 2200 genes among the NCI60 cell-line proteomes.

Conclusions: We discuss in detail the technical and computational challenges in identifying variant peptides by MS and show that uncovering these variants allows the identification of druggable mutations within important cancer genes.
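The abstract names a split target-decoy approach for FDR estimation but does not spell out the procedure. The following is a minimal sketch of the general idea only, assuming that FDR is estimated as decoys divided by targets and that it is computed separately for variant and reference peptide-spectrum matches (PSMs). The `PSM` class, its fields, and both function names are hypothetical and are not taken from the paper or from any search engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (illustrative only)."""
    peptide: str
    score: float       # assumed: higher score means a better match
    is_decoy: bool     # True if the match is to a decoy sequence
    is_variant: bool   # True if the peptide maps only to variant entries


def accepted_at_fdr(psms, max_fdr=0.01):
    """Return target PSMs above the lowest score cutoff that meets max_fdr.

    FDR at each rank is estimated as (#decoys / #targets) among all
    PSMs scoring at least as well (an assumed, commonly used estimator).
    """
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for rank, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = rank
    return [p for p in ranked[:cutoff] if not p.is_decoy]


def split_target_decoy(psms, max_fdr=0.01):
    """Estimate FDR separately for variant and reference PSMs."""
    variant = [p for p in psms if p.is_variant]
    reference = [p for p in psms if not p.is_variant]
    return {
        "variant": accepted_at_fdr(variant, max_fdr),
        "reference": accepted_at_fdr(reference, max_fdr),
    }
```

Separating the two classes keeps the far more numerous reference matches from masking a higher error rate among the variant matches, which is the usual motivation for a class-specific FDR.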

PUBLICATION RECORD

Publication year
2017
Venue
Genome Medicine
Publication date
2017-07-18
Fields of study
Biology, Medicine
Identifiers
DOI 10.1186/s13073-017-0454-9 PMID 28716134 PMCID 5514513
Source metadata
Semantic Scholar, PubMed

REFERENCES

UniProt Protein Knowledgebase.
2017, cited by this paper
MSFragger: ultrafast and comprehensive peptide identification in shotgun proteomics
2017, cited by this paper
Building ProteomeTools based on a complete synthetic human proteome
2017, cited by this paper
PGA: an R/Bioconductor package for identification of novel peptides using a customized database derived from RNA-Seq
2016, cited by this paper
Integrated Proteogenomic Characterization of Human High-Grade Serous Ovarian Cancer.
2016, cited by this paper
Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks
2016, cited by this paper
Proteogenomics for understanding oncology: recent advances and future prospects
2016, cited by this paper
The Ensembl Variant Effect Predictor
2016, influential reference
Proteogenomics connects somatic mutations to signaling in breast cancer
2016, cited by this paper
MSProGene: integrative proteogenomics beyond six-frames and single nucleotide polymorphisms
2015, cited by this paper
Proteogenomics from a bioinformatics angle: A growing field.
2015, cited by this paper
DGIdb 2.0: mining clinically relevant drug–gene interactions
2015, cited by this paper
An Analysis of the Sensitivity of Proteogenomic Mapping of Somatic Mutations and Novel Splicing Events in Cancer
2015, cited by this paper
Spatial genomic heterogeneity within localized, multifocal prostate cancer
2015, cited by this paper
Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection
2015, cited by this paper
A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing
2015, cited by this paper
Human Proteomic Variation Revealed by Combining RNA-Seq Proteogenomics and Global Post-Translational Modification (G-PTM) Search Strategy
2015, cited by this paper
PGTools: A Software Suite for Proteogenomic Data Analysis and Visualization.
2015, cited by this paper
NextSearch: A Search Engine for Mass Spectrometry Data against a Compact Nucleotide Exon Graph.
2015, cited by this paper
Erratum: Onco-proteogenomics: cancer proteomics joins forces with genomics
2015, cited by this paper
Semi-supervised Learning Predicts Approximately One Third of the Alternative Splicing Isoforms as Functional Proteins.
2015, cited by this paper
PPLine: An Automated Pipeline for SNP, SAP, and Splice Variant Detection in the Context of Proteogenomics.
2015, cited by this paper
Universal database search tool for proteomics
2014, cited by this paper
PROTEOFORMER: deep proteome coverage through ribosome profiling and MS integration
2014, cited by this paper
Exome-driven characterization of the cancer cell lines at the proteome level: the NCI-60 case study.
2014, cited by this paper
COSMIC: exploring the world's knowledge of somatic mutations in human cancer
2014, cited by this paper
Proteogenomic characterization of human colon and rectal cancer
2014, cited by this paper
A comprehensive transcriptional portrait of human cancer cell lines
2014, influential reference
Onco-proteogenomics: cancer proteomics joins forces with genomics
2014, cited by this paper
Proteogenomics: concepts, applications and computational strategies
2014, influential reference
The coming age of complete, accurate, and ubiquitous proteomes.
2013, cited by this paper
Low-Grade Fibromyxoid Sarcoma: Incidence, Treatment Strategy of Metastases, and Clinical Significance of the FUS Gene
2013, cited by this paper
customProDB: an R package to generate customized protein databases from RNA-Seq data for proteomics search
2013, cited by this paper
The exomes of the NCI-60 panel: a genomic resource for cancer biology and systems pharmacology.
2013, cited by this paper
DGIdb - Mining the druggable genome
2013, cited by this paper
Peppy: proteogenomic search software.
2013, cited by this paper
Comet: An open‐source MS/MS sequence database search tool
2013, cited by this paper
Proteoform: a single term describing protein complexity
2013, cited by this paper
Global proteome analysis of the NCI-60 cell line panel.
2013, cited by this paper
False discovery rates in spectral identification
2012, cited by this paper
A Bioinformatics Workflow for Variant Peptide Detection in Shotgun Proteomics
2011, cited by this paper
Proteogenomics to discover the full coding content of genomes: a computational perspective.
2010, cited by this paper
MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification
2008, influential reference
Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry
2007, cited by this paper
TANDEM: matching proteins with tandem mass spectra.
2004, cited by this paper
PRISM, a Generic Large Scale Proteomic Investigation Strategy for Mammals
2003, cited by this paper
Probability-based validation of protein identifications using a modified SEQUEST algorithm.
2002, cited by this paper
Shotgun identification of protein modifications from protein complexes and lens tissue
2002, cited by this paper
dbSNP: the NCBI database of genetic variation
2001, cited by this paper
Rapid Communications in Mass Spectrometry
year unknown, cited by this paper

CITED BY

Identification of a novel selenium peptide from Oryzain α hydrolysate: Selenium-specific database establishment, sequence confirmation, and molecular interaction analysis
2026, cites this paper
Uncovering dark mass in population proteomics: Pan-analysis of single amino acid polymorphism relevant to cognition and aging
2025, cites this paper
Proteomic insights into paediatric cancer: Unravelling molecular signatures and therapeutic opportunities
2024, cites this paper
Data‐Independent Acquisition and Label‐Free Quantification for Quantitative Proteomics Analysis of Human Cerebrospinal Fluid
2024, cites this paper
Benchmarking the identification of a single degraded protein to explore optimal search strategies for ancient proteins
2024, cites this paper
Assessment of Data-Independent Acquisition Mass Spectrometry (DIA-MS) for the Identification of Single Amino Acid Variants
2024, cites this paper
Chemoproteogenomic stratification of the missense variant cysteinome
2024, cites this paper
Proteogenomics analysis of human tissues using pangenomes
2024, cites this paper
Quality control of variant peptides identified through proteogenomics- catching the (un)usual suspects
2023, influential citation
Re-evaluating the impact of alternative RNA splicing on proteomic diversity
2023, cites this paper
HLAProphet: Personalized allele-level quantification of the HLA proteins
2023, cites this paper
Proteogenomics in Cancer: Then and Now.
2023, cites this paper
Multi-omic stratification of the missense variant cysteinome
2023, cites this paper
PgxSAVy: A tool for comprehensive evaluation of variant peptide quality in proteogenomics – catching the (un)usual suspects
2023, influential citation
False discovery rate: the Achilles' heel of proteogenomics
2022, cites this paper
Statistical methodology for ribosomal frameshift detection
2022, cites this paper
An analysis of proteogenomics and how and when transcriptome-informed reduction of protein databases can enhance eukaryotic proteomics
2022, cites this paper
A Statistical Detector for Ribosomal Frameshifts and Dual Encodings based on Ribosome Profiling
2022, cites this paper
Cancer Conformational Landscape Shape Tumorigenesis.
2022, cites this paper
Improvement of mutated peptide identification through MS/MS signals searching against the protein libraries generated from transcriptome and translatome
2022, cites this paper
Uncovering the impacts of alternative splicing on the proteome with current omics techniques
2022, cites this paper
Multiaspect Examinations of Possible Alternative Mappings of Identified Variant Peptides: A Case Study on the HEK293 Cell Line
2022, cites this paper
dbPepVar: A Novel Cancer Proteogenomics Database
2022, cites this paper
MALDI mass Spectrometry based proteomics for drug discovery & development.
2021, cites this paper
Transcriptome-informed reduction of protein databases: an analysis of how and when proteogenomics enhances eukaryotic proteomics
2021, cites this paper
The Road to Effective Cancer Immunotherapy—A Computational Perspective on Tumor Epitopes in Anti-Cancer Immunotherapy
2021, cites this paper
Proteogenomic interrogation of cancer cell lines: an overview of the field
2021, cites this paper
Preneoplastic Lesions Fimbria Early Diagnosis Markers Underlying Timeline Mechanisms at The Origin of Ovarian Cancer in BRAC1/2 Patients: Case Reports Based on Proteogenomic Study
2021, cites this paper
Computational and Mass Spectrometry-Based Approach Identify Deleterious Non-Synonymous Single Nucleotide Polymorphisms (nsSNPs) in JMJD6
2021, cites this paper
Targeted Detection of SARS-CoV-2 Nucleocapsid Sequence Variants by Mass Spectrometric Analysis of Tryptic Peptides
2021, cites this paper
Proteomic variations of esophageal squamous cell carcinoma revealed by combining RNA-seq proteogenomics and G-PTM search strategy
2020, cites this paper
Integrative proteomics of prostate cancer
2020, cites this paper
MinProtMaxVP: Generating a minimized number of protein variant sequences containing all possible variant peptides for proteogenomic analysis.
2020, cites this paper
CusVarDB: A tool for building customized sample-specific variant protein database from next-generation sequencing datasets
2020, cites this paper
Precursor Intensity-Based Label-Free Quantification Software Tools for Proteomic and Multi-Omic Analysis within the Galaxy Platform
2020, cites this paper
Universal Spectrum Explorer: A standalone (web-)application for cross-resource spectrum comparison
2020, cites this paper
Preneoplastic lesions fimbria pan-proteomic studies establish the fimbriectomy benefit for BRCA1/2 patients and identify early diagnosis markers of HGSC
2020, cites this paper
Selecting Target Antigens for Cancer Vaccine Development
2020, cites this paper
N-Glycoproteomics of Patient-Derived Xenografts: A Strategy to Discover Tumor-Associated Proteins in High-Grade Serous Ovarian Cancer.
2019, cites this paper
Proteoinformatics and Agricultural Biotechnology Research: Applications and Challenges
2019, cites this paper
Precision De Novo Peptide Sequencing Using Mirror Proteases of Ac-LysargiNase and Trypsin for Large-scale Proteomics
2019, cites this paper
Origins and clinical relevance of proteoforms in pediatric malignancies
2019, cites this paper
Detection and verification of 2.3 million cancer mutations in NCI60 cancer cell lines with a cloud search engine.
2019, cites this paper
Proteogenomics: From next-generation sequencing (NGS) and mass spectrometry-based proteomics to precision medicine.
2019, cites this paper
Proteogenomics of Malignant Melanoma Cell Lines: The Effect of Stringency of Exome Data Filtering on Variant Peptide Identification in Shotgun Proteomics.
2018, cites this paper
Cell-surface proteomics for the identification of novel therapeutic targets in cancer
2018, cites this paper
Comprehensive characterization of the human proteome by multi-omic analyses
2018, cites this paper
Detecting Protein Variants within Mass Spectrometry Datasets
2018, influential citation
Clinical potential of mass spectrometry-based proteogenomics
2018, cites this paper
Connecting Proteomics to Next‐Generation Sequencing: Proteogenomics and Its Current Applications in Biology
2018, cites this paper
Single Amino Acid Variant Discovery in Small Numbers of Cells.
2018, cites this paper
Large-Scale Reanalysis of Publicly Available HeLa Cell Proteomics Data in the Context of the Human Proteome Project.
2018, cites this paper
ProteomeGenerator: A framework for comprehensive proteomics based on de novo transcriptome assembly and high-accuracy peptide mass spectral matching
2017, cites this paper
Genomic technologies—from tools to therapies
2017, cites this paper
Characterization of Peptides and Proteins Associated with Bacterial Proliferation and Bird’s Nest Sample Matrix
year unknown, cites this paper
Methodological insights on proteogenomic approaches to enhance proteomics
year unknown, cites this paper