Exploring the boundaries: gene and protein identification in biomedical text

J. Finkel, Shipra Dingare, Christopher D. Manning, M. Nissim, Beatrice Alex, Claire Grover

Published 2005 in BMC Bioinformatics

ABSTRACT

Background: Good automatic information extraction tools offer hope for automatic processing of the exploding biomedical literature, and successful named entity recognition is a key component for such tools.

Methods: We present a maximum-entropy based system incorporating a diverse set of features for identifying gene and protein names in biomedical abstracts.

Results: This system was entered in the BioCreative comparative evaluation and achieved a precision of 0.83 and recall of 0.84 in the "open" evaluation and a precision of 0.78 and recall of 0.85 in the "closed" evaluation.

Conclusion: Central contributions are rich use of features derived from the training data at multiple levels of granularity, a focus on correctly identifying entity boundaries, and the innovative use of several external knowledge sources including full MEDLINE abstracts and web searches.
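
The abstract reports precision and recall but no combined score. As a reading aid, the sketch below computes the balanced F-measure (the harmonic mean of precision and recall) from the two reported pairs. The function name `f1_score` and the dictionary layout are illustrative assumptions, and the resulting values are derived from the rounded figures above, not quoted from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs as stated in the abstract (already rounded there).
reported = {
    "open": (0.83, 0.84),
    "closed": (0.78, 0.85),
}

# Derived values only; the abstract itself does not state an F-measure.
derived_f1 = {name: round(f1_score(p, r), 3) for name, (p, r) in reported.items()}

for name, value in derived_f1.items():
    print(f"{name}: F1 ~ {value}")
```

Because the inputs are rounded to two decimals, the derived scores are approximate and may differ slightly from any F-measure computed on the unrounded evaluation counts.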
