Significance
Biological systems function through the interaction of numerous molecules influencing a variety of biochemical reactions. However, most biological systems are still only partially understood. This paper introduces GeneFishing, a method for “fishing out” candidate genes in a biological process. The method is “semisupervised” using a set of “bait” genes (i.e., ones previously known to be relevant to the same process). GeneFishing effectively combines modern and traditional statistical ideas for analyzing both big and small data. We applied this method to cholesterol-related genes and identified several interesting phenomena. GeneFishing has the potential for pointing to functional importance in known but poorly studied genes, and its underlying framework is broadly applicable inside and outside biology.

Rapid advances in genomic technologies have led to a wealth of diverse data, from which novel discoveries can be gleaned through the application of robust statistical and computational methods. Here, we describe GeneFishing, a semisupervised computational approach to reconstruct context-specific portraits of biological processes by leveraging gene–gene coexpression information. GeneFishing incorporates multiple high-dimensional statistical ideas, including dimensionality reduction, clustering, subsampling, and results aggregation, to produce robust results. To illustrate the power of our method, we applied it using 21 genes involved in cholesterol metabolism as “bait” to “fish out” (or identify) genes not previously identified as being connected to cholesterol metabolism. Using simulation and real datasets, we found that the results obtained through GeneFishing were more interesting for our study than those provided by related gene prioritization methods. In particular, application of GeneFishing to the GTEx liver RNA sequencing (RNAseq) data not only reidentified many known cholesterol-related genes, but also pointed to glyoxalase I (GLO1) as a gene implicated in cholesterol metabolism. In a follow-up experiment, we found that GLO1 knockdown in human hepatoma cell lines increased levels of cellular cholesterol ester, validating a role for GLO1 in cholesterol metabolism. In addition, we performed pantissue analysis by applying GeneFishing on various tissues and identified many potential tissue-specific cholesterol metabolism-related genes. GeneFishing appears to be a powerful tool for identifying related components of complex biological systems and may be used across a wide range of applications.
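The abstract names four ingredients (dimensionality reduction, clustering, subsampling, and results aggregation) without giving details. The following is a minimal sketch of how such a bait-based pipeline could be wired together, not the authors' implementation. Everything specific here is an assumption made for illustration: the function names `fiedler_split` and `gene_fishing_sketch`, the use of absolute Pearson correlation as the coexpression measure, the two-way spectral split, and the parameters `n_rounds`, `sample_size`, and `bait_purity`.

```python
import numpy as np


def fiedler_split(sim):
    """Two-way spectral split: sign of the second-smallest eigenvector
    of the normalized graph Laplacian built from a similarity matrix."""
    deg = sim.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    lap = np.eye(len(sim)) - d_inv_sqrt[:, None] * sim * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    return vecs[:, 1] > 0


def gene_fishing_sketch(expr, bait_idx, n_rounds=50, sample_size=40,
                        bait_purity=0.8, seed=0):
    """expr: genes x samples array. Returns, for each gene, the fraction of
    rounds it was sampled in which it landed in the bait cluster."""
    rng = np.random.default_rng(seed)
    n_genes = expr.shape[0]
    bait_idx = np.asarray(bait_idx)
    candidates = np.setdiff1d(np.arange(n_genes), bait_idx)
    caught = np.zeros(n_genes)
    tested = np.zeros(n_genes)

    for _ in range(n_rounds):
        # Subsampling: draw a random batch of candidate genes.
        size = min(sample_size, len(candidates))
        sub = rng.choice(candidates, size=size, replace=False)
        pool = np.concatenate([bait_idx, sub])

        # Coexpression similarity (assumed: absolute Pearson correlation).
        sim = np.abs(np.corrcoef(expr[pool]))
        np.fill_diagonal(sim, 0.0)

        # Dimensionality reduction + clustering via a spectral split.
        labels = fiedler_split(sim)
        bait_labels = labels[:len(bait_idx)]
        bait_side = bait_labels.mean() >= 0.5
        purity = np.mean(bait_labels == bait_side)

        tested[sub] += 1
        if purity < bait_purity:
            continue  # bait genes did not cluster together; nothing caught
        sub_labels = labels[len(bait_idx):]
        caught[sub[sub_labels == bait_side]] += 1

    # Aggregation: capture frequency over the rounds each gene was sampled.
    return np.divide(caught, tested, out=np.zeros(n_genes), where=tested > 0)
```

Under these assumptions, candidate genes with a high returned frequency are the ones that repeatedly cluster with the bait set across random subsamples; bait genes themselves are never sampled as candidates and receive a frequency of 0.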
GeneFishing to reconstruct context specific portraits of biological processes
Ke Liu,E. Theusch,Yun Zhou,Tal Ashuach,A. Dosé,P. Bickel,M. Medina,Haiyan Huang
Published 2019 in Proceedings of the National Academy of Sciences of the United States of America
PUBLICATION RECORD
- Publication year: 2019
- Venue: Proceedings of the National Academy of Sciences of the United States of America
- Publication date: 2019-09-04
- Fields of study: Biology, Medicine, Computer Science
- Source metadata: Semantic Scholar, PubMed