Prioritizing transcription factor perturbations from single-cell transcriptomics
Rohit Singh,Joshua Shing Shun Li,S. G. Tattikota,Yifang Liu,Jun Xu,Yanhui Hu,N. Perrimon,Bonnie Berger
Published 2023 in bioRxiv
ABSTRACT
The explosive growth of regulatory hypotheses from single-cell datasets demands accurate prioritization of hypotheses for in vivo validation. However, current computational methods emphasize overall accuracy in regulatory network reconstruction rather than prioritizing a limited set of causal transcription factors (TFs) that can feasibly be tested. We developed Haystack, a hybrid computational-biological algorithm that combines active learning with optimal transport theory to nominate and validate high-confidence causal hypotheses. Our approach efficiently identifies and prioritizes transient but causally active TFs in cell lineages. We applied Haystack to single-cell observations, guiding efficient and cost-effective in vivo validations that reveal causal mechanisms of cell differentiation in Drosophila gut and blood lineages. Notably, all the TFs shortlisted for the final, imaging-based assays were validated as drivers of differentiation. Haystack’s hypothesis-prioritization approach will be crucial for validating concrete discoveries from the increasingly vast collection of low-confidence hypotheses from single-cell transcriptomics.
PUBLICATION RECORD
- Publication year
2023
- Venue
bioRxiv
- Publication date
2023-02-07
- Fields of study
Biology, Computer Science
- Source metadata
Semantic Scholar