Learning from mistakes: Accurate prediction of cell type-specific transcription factor binding
J. Keilwagen, S. Posch, Jan Grau
Published 2017 in bioRxiv
ABSTRACT
Computational prediction of cell type-specific, in-vivo transcription factor binding sites is still one of the central challenges in regulatory genomics, and a variety of approaches have been proposed for this purpose. Here, we present our approach that earned a shared first rank in the “ENCODE-DREAM in vivo Transcription Factor Binding Site Prediction Challenge” in 2017. This approach employs an extensive set of features derived from chromatin accessibility, binding motifs, gene expression, sequence and annotation to train classifiers using a supervised, discriminative learning principle. Two further key aspects of this approach are learning classifier parameters in an iterative training procedure that successively adds additional negative examples to the training set, and creating an ensemble prediction by averaging over classifiers obtained for different training cell types. In post-challenge analyses, we benchmark the influence of different feature sets and find that chromatin accessibility and binding motifs are sufficient to yield state-of-the-art performance for in-vivo binding site predictions. We also show that the iterative training procedure and the ensemble prediction are pivotal for the final prediction performance. To make predictions of this approach readily accessible, we predict 682 peak lists for a total of 31 transcription factors in 22 primary cell types and tissues, which are available for download at http://www.synapse.Org/#!Synapse:syn11526239, and we demonstrate that these predictions may help to yield biological conclusions.
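The two ideas highlighted in the abstract, iterative training that successively adds negative examples and ensemble averaging over classifiers from different training cell types, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the logistic regression stands in for the classifier (the abstract does not specify it), the Gaussian features are synthetic, the rule of adding the highest-scoring unused negatives is one plausible reading of "adds additional negative examples", and all names (`train_logistic`, `iterative_training`, `ensemble_predict`) are hypothetical.

```python
import numpy as np


def _with_bias(X):
    # Append a constant column so the last weight acts as an intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])


def predict_proba(w, X):
    # Logistic (sigmoid) score for each row of X.
    return 1.0 / (1.0 + np.exp(-_with_bias(X) @ w))


def train_logistic(X, y, steps=500, lr=0.5):
    # Stand-in classifier: logistic regression fitted by batch gradient descent.
    Xb = _with_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w


def iterative_training(X_pos, X_neg, rounds=3, n_initial=50, n_add=50, seed=0):
    # Start from a random subset of negatives, then repeatedly add the
    # unused negatives the current classifier scores highest and retrain.
    rng = np.random.default_rng(seed)
    used = np.zeros(len(X_neg), dtype=bool)
    used[rng.choice(len(X_neg), size=n_initial, replace=False)] = True

    def fit():
        X = np.vstack([X_pos, X_neg[used]])
        y = np.concatenate([np.ones(len(X_pos)), np.zeros(int(used.sum()))])
        return train_logistic(X, y)

    w = fit()
    for _ in range(rounds):
        unused = np.flatnonzero(~used)
        if unused.size == 0:
            break
        scores = predict_proba(w, X_neg[unused])
        used[unused[np.argsort(-scores)[:n_add]]] = True
        w = fit()
    return w, used


def ensemble_predict(weights, X):
    # Average the predicted probabilities of all per-cell-type classifiers.
    return np.mean([predict_proba(w, X) for w in weights], axis=0)


# Toy data: three hypothetical training cell types with synthetic features.
rng = np.random.default_rng(1)
weights = []
for cell_type in range(3):
    X_pos = rng.normal(0.5, 1.0, size=(100, 4))
    X_neg = rng.normal(-0.5, 1.0, size=(1000, 4))
    w, used = iterative_training(X_pos, X_neg, seed=cell_type)
    weights.append(w)

# Held-out toy data: first 200 rows positive-like, last 200 negative-like.
X_test = np.vstack([
    rng.normal(0.5, 1.0, size=(200, 4)),
    rng.normal(-0.5, 1.0, size=(200, 4)),
])
probs = ensemble_predict(weights, X_test)
```

In this sketch, each round enlarges the negative set with exactly those examples the current model gets most wrong, and the final prediction for a held-out cell type is the plain mean of the per-cell-type probabilities.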
PUBLICATION RECORD
- Publication year: 2017
- Venue: bioRxiv
- Publication date: 2017-12-06
- Fields of study: Biology, Computer Science
- Source metadata: Semantic Scholar