Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole genome sequencing data for 809 individuals from 233 primate species, and identified 4.3 million common protein-altering variants with orthologs in human. We show that these variants can be inferred to have non-deleterious effects in human based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.
One Sentence Summary
Deep learning classifier trained on 4.3 million common primate missense variants predicts variant pathogenicity in humans.
The landscape of tolerated genetic variation in humans and primates
Hong Gao, T. Hamp, Jeffrey Ede, J. Schraiber, J. McRae, M. Singer-Berk, Yanshen Yang, Anastasia S D Dietrich, P. Fiziev, L. Kuderna, Laksshman Sundaram, Yibing Wu, Aashish N. Adhikari, Yair Field, Chen Chen, S. Batzoglou, F. Aguet, G. Lemire, Rebecca M Reimers, D. Balick, Mareike C. Janiak, Martin Kuhlwilm, Joseph D. Orkin, S. Manu, A. Valenzuela, Juraj Bergman, Marjolaine Rouselle, F. E. Silva, L. Agueda, J. Blanc, M. Gut, D. D. Vries, I. Goodhead, R. A. Harris, M. Raveendran, A. Jensen, I. Chuma, Julie E. Horvath, C. Hvilsom, David Juan, Peter Frandsen, F. Melo, F. Bertuol, Hazel Byrne, I. Sampaio, I. Farias, João Valsecchi do Amaral, M. Messias, Maria N. F. Silva, Mihir Trivedi, R. Rossi, T. Hrbek, N. Andriaholinirina, C. Rabarivola, Alphonse Zaramody, C. Jolly, J. Phillips-Conroy, Gregory K Wilkerson, C. Abee, J. Simmons, E. Fernández‐Duque, Sree Kanthaswamy, F. Shiferaw, Dongdong Wu, Long Zhou, Yong Shao, Guojie Zhang, J. Keyyu, S. Knauf, M. Le, Esther Lizano, S. Merker, A. Navarro, Thomas Batallion, T. Nadler, C. Khor, J. Lee, Patrick Tan, W. K. Lim, A. Kitchener, D. Zinner, I. Gut, A. Melin, K. Guschanski, M. Schierup, R. Beck, G. Umapathy, C. Roos, J. Boubli, M. Lek, S. Sunyaev, Anne O’Donnell, H. Rehm, Jinbo Xu, J. Rogers, Tomàs Marquès-Bonet, K. Farh
Published 2023 in bioRxiv
PUBLICATION RECORD
- Publication year
2023
- Venue
bioRxiv
- Publication date
2023-05-02
- Fields of study
Biology, Medicine
- Source metadata
Semantic Scholar, PubMed