Rare diseases are very difficult to identify among large number of other possible diagnoses. Better availability of patient data and improvement in machine learning algorithms empower us to tackle this problem computationally. In this paper, we target one such rare disease - cardiac amyloidosis. We aim to automate the process of identifying potential cardiac amyloidosis patients with the help of machine learning algorithms and also learn most predictive factors. With the help of experienced cardiologists, we prepared a gold standard with 73 positive (cardiac amyloidosis) and 197 negative instances. We achieved high average cross-validation F1 score of 0.98 using an ensemble machine learning classifier. Some of the predictive variables were: Age and Diagnosis of cardiac arrest, chest pain, congestive heart failure, hypertension, prim open angle glaucoma, and shoulder arthritis. Further studies are needed to validate the accuracy of the system across an entire health system and its generalizability for other diseases.
A Bootstrap Machine Learning Approach to Identify Rare Disease Patients from Electronic Health Records
R. P. Garg,Shu Dong,Sanjiv J. Shah,Siddhartha R. Jonnalagadda
Published 2016 in arXiv.org
ABSTRACT
PUBLICATION RECORD
- Publication year
2016
- Venue
arXiv.org
- Publication date
2016-09-06
- Fields of study
Medicine, Computer Science, Mathematics
- Identifiers
- External record
- Source metadata
Semantic Scholar
CITATION MAP
EXTRACTION MAP
CLAIMS
- No claims are published for this paper.
CONCEPTS
- No concepts are published for this paper.
REFERENCES
Showing 1-42 of 42 references · Page 1 of 1
CITED BY
Showing 1-25 of 25 citing papers · Page 1 of 1