Multi-class protein fold recognition using support vector machines and neural networks

Published 2001 in Bioinform.

ABSTRACT

MOTIVATION: Protein fold recognition is an important approach to structure discovery without relying on sequence similarity. We study this approach with new multi-class classification methods and examine many issues important for a practical recognition system.

RESULTS: Most current discriminative methods for protein fold prediction use the one-against-others method, which has the well-known 'False Positives' problem. We investigated two new methods: the unique one-against-others and the all-against-all methods. Both improve prediction accuracy by 14-110% on a dataset containing 27 SCOP folds. We used the Support Vector Machine (SVM) and the Neural Network (NN) learning methods as base classifiers. SVMs converge fast and lead to high accuracy. When scores of multiple parameter datasets are combined, majority voting reduces noise and increases recognition accuracy. We examined many issues involved with a large number of classes, including dependencies of prediction accuracy on the number of folds and on the number of representatives in a fold. Overall, recognition systems achieve 56% fold prediction accuracy on a protein test dataset, where most of the proteins have below 25% sequence identity with the proteins used in training.
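The all-against-all scheme and the majority voting mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid rule stands in for the SVM and NN base classifiers, and the function names, toy feature vectors, and fold labels (`fold_a`, `fold_b`, `fold_c`) are all hypothetical. With K classes the scheme trains K(K-1)/2 pairwise classifiers, so 27 folds would need 351 of them.

```python
from collections import Counter
from itertools import combinations


def centroid(points):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_all_against_all(samples, labels):
    """Build one binary classifier per unordered pair of classes.

    Each pairwise classifier is a nearest-centroid rule, an assumed
    stand-in for the SVM or NN base classifiers named in the abstract.
    """
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    centroids = {c: centroid(pts) for c, pts in by_class.items()}
    # K classes give K * (K - 1) / 2 pairwise classifiers.
    return [(a, b, centroids[a], centroids[b])
            for a, b in combinations(sorted(centroids), 2)]


def predict_all_against_all(pairwise, x):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter()
    for a, b, ca, cb in pairwise:
        votes[a if sq_dist(x, ca) <= sq_dist(x, cb) else b] += 1
    # Ties are broken by class name so the result is deterministic.
    return min(votes, key=lambda c: (-votes[c], c))


def majority_vote(predictions):
    """Combine the predictions obtained from several feature sets."""
    counts = Counter(predictions)
    return min(counts, key=lambda c: (-counts[c], c))


# Toy data: three hypothetical folds with two samples each.
samples = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [0.0, 5.0], [0.1, 5.2]]
labels = ["fold_a", "fold_a", "fold_b", "fold_b", "fold_c", "fold_c"]

pairwise = train_all_against_all(samples, labels)
single_prediction = predict_all_against_all(pairwise, [4.8, 5.3])
combined_prediction = majority_vote(["fold_b", "fold_a", "fold_b"])
print(len(pairwise), single_prediction, combined_prediction)
```

Because every class competes in pairwise contests rather than against a pooled "others" class, a query cannot be claimed by several classifiers at once in the way that produces false positives under one-against-others. The cost is the quadratic growth in the number of classifiers.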

PUBLICATION RECORD

Publication year
2001
Venue
Bioinform.
Publication date
2001-04-01
Fields of study
Biology, Medicine, Computer Science
Identifiers
DOI 10.1093/bioinformatics/17.4.349 PMID 11301304
Source metadata
Semantic Scholar, PubMed

CITED BY

A global and local unified feature selection algorithm based on hierarchical structure constraints
2025cites this paper
Power Transformer Fault Diagnosis Based on Multi Class SVM and IPSO
2025cites this paper
Uncertain multi-conceptual information acquisition and fusion for hierarchical classification
2025cites this paper
GCDTA: Graph-attention-assisted contrastive learning for drug-target affinity prediction
2025cites this paper
Hierarchical feature selection via joint local label enhancement and neighborhood label distribution correlation
2025cites this paper
Musical Insights into Protein Sequences and Functions via NMR Data
2025cites this paper
Longitudinal Risk Analysis of Second Primary Cancer after Curative Treatment in Patients with Rectal Cancer
2024cites this paper
Incremental feature selection for large-scale hierarchical classification with the arrival of new samples
2024cites this paper
Sparse Feature-Persistent Hierarchical Classification
2024cites this paper
Kinematic Analysis of Human Gait in Healthy Young Adults Using IMU Sensors: Exploring Relevant Machine Learning Features for Clinical Applications
2024cites this paper
Hierarchical feature selection based on neighborhood interclass spacing from fine to coarse
2024cites this paper
Online hierarchical streaming feature selection based on adaptive neighborhood rough set
2024cites this paper
Comparison of machine learning approaches for positive airway pressure adherence prediction in a veteran cohort
2024cites this paper
Accurate Prediction of Lysine Methylation Sites Using Evolutionary and Structural-Based Information
2024cites this paper
DMTFS-FO: Dynamic multi-task feature selection based on flexible loss and orthogonal constraint
2024cites this paper
PRFold-TNN: Protein Fold Recognition With an Ensemble Feature Selection Method Using PageRank Algorithm Based on Transformer
2024influential citation
An incremental approach to hierarchical feature selection by applying fuzzy rough set technique
2023cites this paper
Prediction of landslide induced debris’ severity using machine learning algorithms: a case of South Korea
2023cites this paper
Snacks: a fast large-scale kernel SVM solver
2023cites this paper
Disto-TRP: An approach for identifying transient receptor potential (TRP) channels using structural information generated by AlphaFold.
2023cites this paper
FS-MGKC: Feature selection based on structural manifold learning with multi-granularity knowledge coordination
2023cites this paper
Feature selection via maximizing inter-class independence and minimizing intra-class redundancy for hierarchical classification
2023cites this paper
A fuzzy rough set approach to hierarchical feature selection based on Hausdorff distance
2022cites this paper
An Analysis of Protein Language Model Embeddings for Fold Prediction
2022cites this paper
Single-Stranded DNA Binding Proteins and Their Identification Using Machine Learning-Based Approaches
2022cites this paper
ASRmiRNA: Abiotic Stress-Responsive miRNA Prediction in Plants by Using Machine Learning Algorithms with Pseudo K-Tuple Nucleotide Compositional Features
2022cites this paper
Optimal Neighborhood Multiple Kernel Clustering With Adaptive Local Kernels
2022cites this paper
ToxinMI: improving peptide toxicity prediction by fusing multimodal information based on mutual information
2022cites this paper
Hierarchical metric learning with intra-level and inter-level regularization
2022cites this paper
Prediction of Heart Disease using Machine Learning Algorithm: Support Vector Machine
2022cites this paper
A novel hierarchical feature selection method based on large margin nearest neighbor learning
2022cites this paper
Screening gene signatures for clinical response subtypes of lung transplantation
2022cites this paper
An extended physics informed neural network for preliminary analysis of parametric optimal control problems
2021cites this paper
Hierarchical classification of data with long-tailed distributions via global and local granulation
2021cites this paper
MLDH-Fold: Protein fold recognition based on multi-view low-rank modeling
2021cites this paper
A New Method for Binary Classification of Proteins with Machine Learning
2021cites this paper
Artificial Intelligence-Based Ensemble Model for Rapid Prediction of Heart Disease
2021cites this paper
ActTRANS: Functional classification in active transport proteins based on transfer learning and contextual representations
2021cites this paper
EnZymClass: Substrate specificity prediction tool of plant acyl-ACP thioesterases based on Ensemble Learning
2021cites this paper
MRMD-palm: A novel method for the identification of palmitoylated protein
2021cites this paper
ProteinPrompt: a webserver for predicting protein–protein interactions
2021cites this paper
Predicting Students’ Problem Solving Performance using Support Vector Machine
2021cites this paper
Discovering critical proteins in the learning process in a Down Syndrome model of mouse through machine learning
2021cites this paper
A Recursive Regularization Based Feature Selection Framework for Hierarchical Classification
2021cites this paper
Robust hierarchical feature selection with a capped ℓ2-norm
2021cites this paper
Determining Protein–Protein Interaction Using Support Vector Machine: A Review
2021cites this paper
Structural protein fold recognition based on secondary structure and evolutionary information using machine learning algorithms
2021cites this paper
A protein structural study based on the centrality analysis of protein sequence feature networks
2021cites this paper
Cost-sensitive hierarchical classification via multi-scale information entropy for data with an imbalanced distribution
2021cites this paper
Development of a TSR-Based Method for Protein 3-D Structural Comparison With Its Applications to Protein Classification and Motif Discovery
2021cites this paper
DeepDTAF: a deep learning method to predict protein-ligand binding affinity
2021cites this paper
ASFold-DNN: Protein Fold Recognition Based on Evolutionary Features With Variable Parameters Using Full Connected Neural Network
2021influential citation
Hierarchical feature selection with multi-granularity clustering structure
2021cites this paper
A Novel Amino Acid Properties Selection Method for Protein Fold Classification.
2020cites this paper
Validation of Neural Network Predictions for the Outcome of Refractive Surgery for Myopia
2020cites this paper
Protein Structure Prediction Using Robust Principal Component Analysis and Support Vector Machine
2020cites this paper
Hierarchical classification with multi-path selection based on granular computing
2020cites this paper
Transfer Learning for Protein Structure Classification and Function Inference at Low Resolution
2020cites this paper
Microarray Data Classification to Detect Cancer Cells by Using Discrete Wavelet Transform and Combining Classifiers Methods
2020cites this paper
MSclassifier: median-supplement model-based classification tool for automated knowledge discovery
2020cites this paper
A portable, low-cost and sensor-based detector on sweetness and firmness grades of kiwifruit
2020cites this paper
Classification: a Tour of the Classics
2020cites this paper
Recognition of Mitochondrial Proteins in Plasmodium Based on the Tripeptide Composition
2020cites this paper
Kernelized fuzzy rough sets based online streaming feature selection for large-scale hierarchical classification
2020cites this paper
Protein Fold Recognition From Sequences Using Convolutional and Recurrent Neural Networks
2020cites this paper
SSCpred: Single-Sequence-Based Protein Contact Prediction Using Deep Fully Convolutional Network
2020cites this paper
FoldRec-C2C: protein fold recognition by combining cluster-to-cluster model and protein similarity network
2020cites this paper
Accelerating Physics-Informed Neural Network Training with Prior Dictionaries
2020cites this paper
Protein Fold Prediction for Protein Sequences of Low Identity Based on Evolutionary and Spatial Features Using Random Forest Algorithm
2020cites this paper
Solving the protein folding problem in hydrophobic-polar model using deep reinforcement learning
2020cites this paper
Comparative Survey of Machine Learning Techniques for Prediction of Parkinson's Disease
2020cites this paper
Identification and Analysis of Glioblastoma Biomarkers Based on Single Cell Sequencing
2020cites this paper
Cost-sensitive hierarchical classification for imbalance classes
2020cites this paper
Predicting ATP-Binding Cassette Transporters Using the Random Forest Method
2020cites this paper
Protein Fold Recognition Based on Auto-Weighted Multi-View Graph Embedding Learning Model
2020cites this paper
Centered kernel alignment inspired fuzzy support vector machine
2020cites this paper
Protein Fold Recognition by Combining Support Vector Machines and Pairwise Sequence Similarity Scores
2020cites this paper
A parallel classification framework for protein fold recognition
2020cites this paper
Robust hierarchical feature selection driven by data and knowledge
2020cites this paper
Hybrid Data-Driven and Physics-Based Modeling for Gas Turbine Prescriptive Analytics
2020cites this paper
Classification of Diabetic Retinopathy using shallow learning approach
2020cites this paper
An Enhanced Protein Fold Recognition for Low Similarity Datasets Using Convolutional and Skip-Gram Features With Deep Neural Network
2020influential citation
Capreomycin resistance prediction in two species of Mycobacterium using a stacked ensemble method
2019cites this paper
AOPs-SVM: A Sequence-Based Classifier of Antioxidant Proteins Using a Support Vector Machine
2019cites this paper
Automatic speech patterns recognition of commands using SVM and PSO
2019cites this paper
A Comparative Study of Machine Learning Techniques for Emotion Recognition
2019cites this paper
Sparse discriminative least squares regression model
2019cites this paper
Logical analysis of multiclass data with relaxed patterns
2019cites this paper
A k-Nearest Neighbor Algorithm-Based Near Category Support Vector Machine Method for Event Identification of φ-OTDR
2019cites this paper
Antimicrobial Resistance Prediction for Gram-Negative Bacteria via Game Theory-Based Feature Evaluation
2019cites this paper
Development of species specific putative miRNA and its target prediction tool in wheat (Triticum aestivum L.)
2019cites this paper
Free alignment classification of dikarya fungi using some machine learning methods
2019cites this paper
A novel fusion based on the evolutionary features for protein fold recognition using support vector machines
2019cites this paper
Decoding the Structural Keywords in Protein Structure Universe
2019cites this paper
Hierarchical feature extraction based on discriminant analysis
2019cites this paper
Protein fold recognition based on multi-view modeling
2019cites this paper
Interpretation of Phase Boundary Fluctuation Spectra in Biological Membranes with Nanoscale Organization
2019cites this paper
A study on separation of the protein structural types in amino acid sequence feature spaces
2019cites this paper
Development of sustainable multivariate analytical approach for smart factory
2019cites this paper
Discriminative margin-sensitive autoencoder for collective multi-view disease analysis
2019cites this paper