Radiographic Reports Generation via Retrieval Enhanced Cross-modal Fusion
Xia Hou, Yifan Luo, Wenfeng Song, Yuting Guo, Wenzhe You, Shuai Li
Published 2024 in IEEE International Conference on Bioinformatics and Biomedicine

ABSTRACT
Accurate radiographic reports are crucial for effective clinical decision-making and patient safety, as they directly influence diagnosis and treatment plans. Existing models for radiographic report generation often struggle to integrate medical image and textual report features and to address data imbalance. To overcome these limitations, we propose an enhanced cross-modal aligned retrieval-driven network (EARnet). Our model incorporates two key innovations: an Enhanced Cross-modal Alignment (ECA) module and a Case-based Retrieval Augmenter (CRA) module. The ECA module integrates medical visual and textual data effectively by aligning the two modalities. The CRA module mitigates data imbalance by enriching the representation of medical information and ensuring more balanced, comprehensive coverage of both normal and abnormal cases. This dual-module design significantly improves the coherence and accuracy of the generated medical reports. Evaluation results demonstrate that EARnet substantially outperforms existing methods in report quality and accuracy across multiple metrics. Our code and models are available at https://github.com/lyf616/EARnet.
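The abstract does not detail the internals of the two modules; as an illustration only, below is a minimal PyTorch-style sketch of the two ideas it names: contrastive alignment of image and report embeddings in a shared space (in the spirit of ECA) and nearest-neighbor retrieval of stored case embeddings to augment generation (in the spirit of CRA). All class names, feature dimensions, and the InfoNCE-style loss here are assumptions for exposition, not the authors' implementation; see the linked repository for the actual code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalAlignment(nn.Module):
    """Illustrative cross-modal alignment (hypothetical, not the paper's ECA):
    project image and report features into a shared embedding space and align
    them with a symmetric InfoNCE-style contrastive loss."""
    def __init__(self, img_dim=2048, txt_dim=768, embed_dim=256, temperature=0.07):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, embed_dim)   # image-side projection head
        self.txt_proj = nn.Linear(txt_dim, embed_dim)   # text-side projection head
        self.temperature = temperature

    def forward(self, img_feats, txt_feats):
        # L2-normalized embeddings in the shared space
        z_img = F.normalize(self.img_proj(img_feats), dim=-1)
        z_txt = F.normalize(self.txt_proj(txt_feats), dim=-1)
        # pairwise similarity logits; matched image-report pairs lie on the diagonal
        logits = z_img @ z_txt.t() / self.temperature
        targets = torch.arange(logits.size(0), device=logits.device)
        # symmetric image-to-text and text-to-image contrastive loss
        loss = 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))
        return loss, z_img, z_txt

def retrieve_similar_cases(query, case_bank, k=3):
    """Illustrative retrieval augmenter (hypothetical, not the paper's CRA):
    given aligned image embeddings, fetch the k most similar stored case
    embeddings. `case_bank` is an assumed (N, D) tensor of cached cases."""
    sims = F.normalize(query, dim=-1) @ F.normalize(case_bank, dim=-1).t()
    topk = sims.topk(k, dim=-1)
    return case_bank[topk.indices], topk.values   # (B, k, D) cases, (B, k) scores

In a pipeline of this shape, the retrieved case embeddings would typically be concatenated with, or cross-attended by, the report decoder's inputs, so that under-represented (e.g., abnormal) findings are backed by similar reference cases at generation time.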
PUBLICATION RECORD
- Publication year
2024
- Venue
IEEE International Conference on Bioinformatics and Biomedicine
- Publication date
2024-12-03
- Fields of study
Medicine, Computer Science, Engineering
- Source metadata
Semantic Scholar