Feature Selection for the Prediction of Translation Initiation Sites

Published 2016 in Genomics, Proteomics & Bioinformatics

ABSTRACT

Translation initiation sites (TISs) are important signals in cDNA sequences. In previous attempts to predict TISs in cDNA sequences, three major factors have affected prediction performance: the nature of the cDNA sequence sets, the features selected, and the classification methods used. In this paper, we examine different approaches to selecting and integrating relevant features for TIS prediction. The top-ranked features include those derived from the position weight matrix and the propensity matrix, the number of C nucleotides in the sequence downstream of the ATG, the number of downstream stop codons, the number of upstream ATGs, and the counts of certain amino acids, such as A and D. Using the numerical data generated from these features, different classification methods, including decision tree, naïve Bayes, and support vector machine, were applied to three independent sequence sets. The identified significant features were found to be biologically meaningful, and the experiments showed promising results.
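
As an illustration of the count-based features named in the abstract, the following Python sketch computes a few of them for one candidate ATG. It is not the authors' implementation: the function name `tis_features`, the 99-nucleotide window, and the choice to count stop codons and amino acids only in the reading frame of the candidate ATG are all assumptions made for this example. The position weight matrix and propensity matrix features are omitted because the abstract does not describe how they are built.

```python
# Hypothetical sketch: simple count features around one candidate ATG.
# Window size and in-frame counting are assumptions, not taken from the paper.

STOP_CODONS = {"TAA", "TAG", "TGA"}
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}  # codons for amino acid A (alanine)
ASP_CODONS = {"GAT", "GAC"}                # codons for amino acid D (aspartate)


def tis_features(seq, atg_pos, window=99):
    """Return count features for the candidate ATG starting at index atg_pos."""
    seq = seq.upper()
    if seq[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("atg_pos does not point at an ATG")

    # Sequence before and after the candidate ATG, limited to the window.
    upstream = seq[max(0, atg_pos - window):atg_pos]
    downstream = seq[atg_pos + 3:atg_pos + 3 + window]

    # Complete codons in the reading frame set by the candidate ATG.
    codons = [downstream[i:i + 3] for i in range(0, len(downstream) - 2, 3)]

    return {
        # ATG triplets upstream of the candidate, in any frame.
        "upstream_atg": sum(
            1 for i in range(len(upstream) - 2) if upstream[i:i + 3] == "ATG"
        ),
        # Number of C nucleotides downstream of the candidate.
        "downstream_c": downstream.count("C"),
        # In-frame stop codons downstream of the candidate.
        "downstream_stops": sum(1 for c in codons if c in STOP_CODONS),
        # In-frame codons encoding amino acids A and D.
        "ala_count": sum(1 for c in codons if c in ALA_CODONS),
        "asp_count": sum(1 for c in codons if c in ASP_CODONS),
    }


if __name__ == "__main__":
    # Toy sequence; the candidate ATG starts at index 6.
    print(tis_features("AATGCCATGGCTGACCCCTAA", 6))
```

Each candidate ATG in a cDNA sequence would yield one such feature dictionary, and the resulting numeric rows could then be passed to a feature-ranking step and a classifier of the kind listed in the abstract.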

PUBLICATION RECORD

Publication year
2016
Venue
Genomics, Proteomics & Bioinformatics
Publication date
2016-11-28
Fields of study
Biology, Medicine, Computer Science
Identifiers
DOI 10.1016/S1672-0229(05)03012-3 PMID 16393144 PMCID 5172590
Source metadata
Semantic Scholar, PubMed

REFERENCES

Translation initiation sites prediction with mixture Gaussian models in human cDNA sequences
2005 (influential reference)
A Class of Edit Kernels for SVMs to Predict Translation Initiation Sites in Eukaryotic mRNAs
2004 (influential reference)
Using Amino Acid Patterns to Accurately Predict Translation Initiation Sites
2004 (cited by this paper)
Comparison of computational methods for identifying translation initiation sites in EST data
2004 (cited by this paper)
Recognizing translation initiation sites of eukaryotic genes based on the cooperatively scanning model
2003 (cited by this paper)
Recognition of Translation Initiation Sites of Eukaryotic Genes Based on an EM Algorithm
2003 (cited by this paper)
UniGene: A Unified View of the Transcriptome
2003 (cited by this paper)
Data Mining Tools for Biological Sequences
2003 (cited by this paper)
Using feature generation and feature selection for accurate prediction of translation initiation sites
2002 (influential reference)
Pushing the limits of the scanning mechanism for initiation of translation
2002 (cited by this paper)
Translation initiation start prediction in human cDNAs with high accuracy
2002 (influential reference)
Presence of ATG triplets in 5' untranslated regions of eukaryotic cDNAs correlates with a 'weak' context of the start codon
2001 (cited by this paper)
Prediction whether a human cDNA sequence contains initiation codon by combining statistical information and similarity with protein sequences
2000 (cited by this paper)
Prediction of human translational initiation sites using a multiple neural network approach
2000 (influential reference)
Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites
2000 (influential reference)
A Tutorial on Support Vector Machines for Pattern Recognition
1998 (cited by this paper)
Assessing protein coding region integrity in cDNA sequencing projects
1998 (influential reference)
Detecting non-adjoining correlations with signals in DNA
1998 (cited by this paper)
Neural Network Prediction of Translation Initiation Sites in Eukaryotes: Perspectives for EST and Genome Analysis
1997 (influential reference)
A method for identifying splice sites and translational start sites in eukaryotic mRNA
1997 (cited by this paper)
Interpreting cDNA sequences: Some insights from studies on translation
1996 (cited by this paper)
Chi2: feature selection and discretization of numeric attributes
1995 (cited by this paper)
The Gene Identification Problem: An Overview for Developers
1995 (cited by this paper)
Estimating Attributes: Analysis and Extensions of RELIEF
1994 (cited by this paper)
Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning
1993 (cited by this paper)
A consideration of alternative models for the initiation of translation in eukaryotes
1992 (cited by this paper)
C4.5: Programs for Machine Learning
1992 (cited by this paper)
The Feature Selection Problem: Traditional Methods and a New Algorithm
1992 (cited by this paper)
Consensus patterns in DNA
1990 (cited by this paper)
The scanning model for translation: an update
1989 (cited by this paper)
tRNAi(met) functions in directing the scanning ribosome to the start site of translation
1988 (cited by this paper)
At least six nucleotides preceding the AUG initiator codon enhance translation in mammalian cells
1987 (cited by this paper)
An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs
1987 (cited by this paper)
Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes
1986 (cited by this paper)
On the predictive recognition of signal peptide sequences
1985 (cited by this paper)
Use of the 'Perceptron' algorithm to distinguish translational initiation sites in E. coli
1982 (cited by this paper)
How do eucaryotic ribosomes select initiation regions in messenger RNA?
1978 (cited by this paper)

CITED BY

GSRNet, an adversarial training-based deep framework with multi-scale CNN and BiGRU for predicting genomic signals and regions
2023 (cites this paper)
Mapping genomes by using bioinformatics data and tools
2021 (cites this paper)
Nucleotide Substitution Models and Evolutionary Distances
2020 (cites this paper)
Distance-Based Phylogenetic Methods
2020 (cites this paper)
Bioinformatics and the Cell
2018 (cites this paper)
Comparative analysis of protein-coding and long non-coding transcripts based on RNA sequence features
2018 (cites this paper)
Fundamentals of Proteomics
2018 (cites this paper)
Bioinformatics and Translation Elongation
2018 (cites this paper)
Bioinformatics and Translation Initiation
2018 (cites this paper)
Protein Isoelectric Point and Helicobacter pylori
2018 (cites this paper)
Position weight matrix and Perceptron
2018 (cites this paper)
Proteins Recognizing DNA: Structural Uniqueness and Versatility of DNA-Binding Domains in Stem Cell Transcription Factors
2017 (cites this paper)
Functional motifs in Escherichia coli NC101
2013 (cites this paper)
Recognition of Translation Initiation Sites in Arabidopsis Thaliana
2012 (cites this paper)
Dragon TIS Spotter: an Arabidopsis-derived predictor of translation initiation sites in plants
2012 (cites this paper)
Transition Initiation Sites (TIS) Recognition in DNA Sequence using Machine Learning
2012 (cites this paper)
Position Weight Matrix, Gibbs Sampler, and the Associated Significance Tests in Motif Characterization and Prediction
2012 (cites this paper)
Improvement in the prediction of the translation initiation site through balancing methods, inclusion of acquired knowledge and addition of features to sequences of mRNA
2011 (cites this paper)
Identification, Characterization and Evolution of Invertebrate Telomerase RNA
2011 (cites this paper)
Identification and characterization of sea squirt telomerase reverse transcriptase
2007 (cites this paper)
Feature Mining and Integration for Improving the Prediction Accuracy of Translation Initiation Sites in Eukaryotic mRNAs
2006 (cites this paper)
Pattern Recognition in Bioinformatics: An Introduction
2006 (cites this paper)