Chemical-protein Interaction Extraction via ChemicalBERT and Attention Guided Graph Convolutional Networks in Parallel
Published 2020 in IEEE International Conference on Bioinformatics and Biomedicine
ABSTRACT
Automated recognition of functional interactions between compounds and proteins/genes in the biomedical literature is essential for drug discovery, knowledge understanding, and basic clinical research. Although several computational methods have achieved competitive performance in extracting these relations, there is still significant room for improvement in fully capturing the complex semantic and syntactic information within sentences. We present a novel parallel model to improve chemical-protein interaction (CPI) extraction. The model consists of two parallel components: ChemicalBERT and Attention Guided Graph Convolutional Networks (AGGCN). We pre-train BERT on large-scale chemical interaction corpora, naming the resulting model ChemicalBERT, to generate high-quality contextual representations, and we employ AGGCN to capture the syntactic graph information of each sentence. The contextual representation and the syntactic graph representation are then merged in a fusion layer and fed into a fully-connected softmax layer to extract CPIs. We evaluate the proposed model on the ChemProt corpus, the benchmark corpus for this domain, and achieve state-of-the-art results for CPI extraction with a micro-averaged F1-score of 80.21%. To further demonstrate the efficacy of the model, we also conduct experiments on the DDIExtraction 2013 corpus and obtain a micro-averaged F1-score of 82.88%, which is also the highest score among existing models. Experimental results show that the proposed model can adequately capture semantic and syntactic information by extracting sentence features from different views in parallel. The code is available at https://github.com/ql-bio/CPR extraction.
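The final step described in the abstract, merging a contextual representation with a syntactic graph representation and passing the result to a fully-connected softmax layer, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the fusion is a plain concatenation, and the vector sizes, the number of classes, the random weights, and the helper names `softmax` and `fuse_and_classify` are all hypothetical choices made for illustration. The two input vectors stand in for the outputs of ChemicalBERT and AGGCN.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(contextual, syntactic, weights, bias):
    """Concatenate two feature vectors, apply one linear layer, then softmax.

    Concatenation is an assumption; the abstract only says the two
    representations are merged in a fusion layer.
    """
    fused = contextual + syntactic  # list concatenation
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Hypothetical sizes and random stand-in values, for illustration only.
rng = random.Random(0)
contextual = [rng.uniform(-1, 1) for _ in range(8)]  # stand-in for ChemicalBERT output
syntactic = [rng.uniform(-1, 1) for _ in range(4)]   # stand-in for AGGCN output
num_classes = 6                                      # hypothetical label count

fused_size = len(contextual) + len(syntactic)
weights = [
    [rng.uniform(-0.1, 0.1) for _ in range(fused_size)]
    for _ in range(num_classes)
]
bias = [0.0] * num_classes

probs = fuse_and_classify(contextual, syntactic, weights, bias)
predicted = max(range(num_classes), key=probs.__getitem__)
print(predicted, [round(p, 3) for p in probs])
```

In a trained model the weights would be learned jointly with both encoders; here they are random, so the predicted class carries no meaning beyond showing the data flow.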
PUBLICATION RECORD
- Publication year
2020
- Venue
IEEE International Conference on Bioinformatics and Biomedicine
- Publication date
2020-12-16
- Fields of study
Chemistry, Computer Science, Biology
- Source metadata
Semantic Scholar