Radiographic Reports Generation via Retrieval Enhanced Cross-modal Fusion
Xia Hou, Yifan Luo, Wenfeng Song, Yuting Guo, Wenzhe You, Shuai Li
Published 2024 in IEEE International Conference on Bioinformatics and Biomedicine

ABSTRACT
Accurate radiographic reports are crucial for effective clinical decision-making and patient safety, as they directly influence diagnosis and treatment plans. Existing models for radiographic report generation often struggle to integrate medical image and textual report features and to address data imbalance. To overcome these limitations, we propose an enhanced cross-modal aligned retrieval-driven network (EARnet). Our model incorporates two key innovations: an Enhanced Cross-modal Alignment (ECA) module and a Case-based Retrieval Augmenter (CRA) module. The ECA module integrates medical visual and textual data effectively by aligning the two modalities. The CRA module mitigates data imbalance by enriching the representation of medical information and ensuring more balanced, comprehensive coverage of both normal and abnormal cases. This dual-module design significantly improves the coherence and accuracy of the generated medical reports. Evaluation results demonstrate that EARnet substantially outperforms existing methods in report quality and accuracy across multiple metrics. Our code and models are available at https://github.com/lyf616/EARnet.
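The abstract does not detail the internals of the two modules; as an illustration only, below is a minimal PyTorch-style sketch of the two ideas it names: contrastive alignment of image and report embeddings in a shared space (in the spirit of ECA) and nearest-neighbor retrieval of stored case embeddings to augment generation (in the spirit of CRA). All class names, feature dimensions, and the InfoNCE-style loss here are assumptions for exposition, not the authors' implementation; see the linked repository for the actual code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalAlignment(nn.Module):
    """Illustrative cross-modal alignment (hypothetical, not the paper's ECA):
    project image and report features into a shared embedding space and align
    them with a symmetric InfoNCE-style contrastive loss."""
    def __init__(self, img_dim=2048, txt_dim=768, embed_dim=256, temperature=0.07):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, embed_dim)   # image-side projection head
        self.txt_proj = nn.Linear(txt_dim, embed_dim)   # text-side projection head
        self.temperature = temperature

    def forward(self, img_feats, txt_feats):
        # L2-normalized embeddings in the shared space
        z_img = F.normalize(self.img_proj(img_feats), dim=-1)
        z_txt = F.normalize(self.txt_proj(txt_feats), dim=-1)
        # pairwise similarity logits; matched image-report pairs lie on the diagonal
        logits = z_img @ z_txt.t() / self.temperature
        targets = torch.arange(logits.size(0), device=logits.device)
        # symmetric image-to-text and text-to-image contrastive loss
        loss = 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))
        return loss, z_img, z_txt

def retrieve_similar_cases(query, case_bank, k=3):
    """Illustrative retrieval augmenter (hypothetical, not the paper's CRA):
    given aligned image embeddings, fetch the k most similar stored case
    embeddings. `case_bank` is an assumed (N, D) tensor of cached cases."""
    sims = F.normalize(query, dim=-1) @ F.normalize(case_bank, dim=-1).t()
    topk = sims.topk(k, dim=-1)
    return case_bank[topk.indices], topk.values   # (B, k, D) cases, (B, k) scores

In a pipeline of this shape, the retrieved case embeddings would typically be concatenated with, or cross-attended by, the report decoder's inputs, so that under-represented (e.g., abnormal) findings are backed by similar reference cases at generation time.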
PUBLICATION RECORD
- Publication year
2024
- Venue
IEEE International Conference on Bioinformatics and Biomedicine
- Publication date
2024-12-03
- Fields of study
Medicine, Computer Science, Engineering
- Source metadata
Semantic Scholar