Physicians and the general public are increasingly using web-based tools to find answers to medical questions. The field of rare diseases is especially challenging and important, as shown by the long delays and many mistakes associated with diagnoses. In this paper we review recent initiatives on the use of web search, social media and data mining in data repositories for medical diagnosis. We compare the retrieval accuracy on 56 rare disease cases with known diagnosis for the web search tools google.com, pubmed.gov, omim.org and our own search tool findzebra.com. We give a detailed description of IBM's Watson system and make a rough comparison between findzebra.com and Watson on subsets of the Doctor's dilemma dataset. For the 56 cases, recall@10 (the fraction of cases where the correct result appears in the top 10) was 29%, 16%, 27% and 59% for google.com, pubmed.gov, omim.org and findzebra.com, respectively; recall@20 (the same fraction for the top 20) was 32%, 18%, 34% and 64%. Thus, FindZebra has a significantly (p < 0.01) higher recall than the other three search engines. When tested under the same conditions, Watson and FindZebra showed similar recall@10. However, the tests were performed on different subsets of the Doctor's dilemma questions. Advances in technology and access to high-quality data have opened new possibilities for aiding the diagnostic process. Specialized search engines, data mining tools and social media are some of the areas that hold promise.
Rare disease diagnosis: A review of web search, social media and large-scale data-mining approaches
Dan Svenstrup, Henrik L. Jørgensen, O. Winther
Published 2015 in Rare Diseases
PUBLICATION RECORD
- Publication year
2015
- Venue
Rare Diseases
- Publication date
2015-01-01
- Fields of study
Medicine, Computer Science
- Source metadata
Semantic Scholar, PubMed
CONCEPTS
- data mining tools
Computational tools that search or mine large repositories of medical data for diagnostic support.
Aliases: data-mining tools
- doctor's dilemma dataset
A question set used as the benchmark source for subsets in the comparison between FindZebra and Watson.
Aliases: Doctors dilemma dataset, Doctor's dilemma questions
- findzebra
A specialized search engine developed to help users find rare-disease diagnoses in biomedical text.
Aliases: findzebra.com
- rare disease diagnosis
The diagnostic problem of identifying rare diseases, which is the clinical setting discussed in the review and evaluation.
Aliases: rare-disease diagnosis
- recall@10 and recall@20
Ranking-based evaluation metrics measuring the fraction of cases where the correct diagnosis appears within the top 10 or top 20 results.
Aliases: recall@10, recall@20
- social media
Online social platforms considered as a potential source of diagnostic information in the reviewed approaches.
Aliases: social platforms
- watson system
IBM's question-answering system described and compared with FindZebra on rare-disease questions.
Aliases: Watson, IBM Watson
- web search tools
Internet-based search engines and online resources evaluated or reviewed as aids for medical diagnosis.
Aliases: web search, search tools
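The recall@10 and recall@20 metrics defined above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `recall_at_k`, the exact-string matching rule, and the toy rankings are assumptions made purely for illustration, and the data below are invented rather than taken from the 56 cases.

```python
from typing import Sequence


def recall_at_k(
    ranked_results: Sequence[Sequence[str]],
    correct: Sequence[str],
    k: int,
) -> float:
    """Fraction of cases whose correct diagnosis appears in the top k results.

    ranked_results[i] is the ranked result list returned for case i and
    correct[i] is the known diagnosis for that case. Matching is by exact
    string equality, which is an assumption of this sketch.
    """
    if len(ranked_results) != len(correct):
        raise ValueError("need exactly one ranked list per case")
    if not correct:
        raise ValueError("need at least one case")
    hits = sum(
        1
        for results, truth in zip(ranked_results, correct)
        if truth in results[:k]
    )
    return hits / len(correct)


# Hypothetical toy data: four cases, three ranked results each.
toy_rankings = [
    ["disease A", "disease B", "disease C"],
    ["disease D", "disease E", "disease F"],
    ["disease G", "disease H", "disease I"],
    ["disease J", "disease K", "disease L"],
]
toy_truth = ["disease B", "disease F", "disease X", "disease J"]

for k in (1, 2, 3):
    print(f"recall@{k} = {recall_at_k(toy_rankings, toy_truth, k):.2f}")
```

Because the top-10 list is a prefix of the top-20 list, recall@20 can never be lower than recall@10 for the same search engine, which matches the pattern in the reported figures.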
REFERENCES
30 references
CITED BY
72 citing papers