Medical deep learning faces challenges such as limited annotated data and insufficient model generalization. Foundation models (FMs) are reshaping the paradigm of medical image interpretation through large-scale pretraining and efficient fine-tuning. Unlike traditional models that focus on a single modality and task, FMs enable multi-modal representation and task-agnostic transfer, adapting to various downstream applications without extensive annotation or retraining. This paper systematically reviews research progress on medical FMs, focusing on medical tasks, datasets, and evaluation metrics. It covers key interpretation tasks such as classification, segmentation, generation, and prognosis prediction. At the data level, it integrates multi-source data, including two-dimensional (2D) and 3D medical imaging, vision-language data, electronic health records (EHRs), physiological signals, and bioinformatics data, and it summarizes the evaluation metrics for each task. On this basis, the paper categorizes and analyzes mainstream medical FMs, including pretrained models, vision FMs, vision-language FMs, and extended multi-modal FMs, and provides a systematic comparison of their performance and characteristics. Furthermore, we propose the IPIU medical FM platform, which integrates large-scale medical data, universal vision models, medical vision-language models, and medical large language models, and we verify its effectiveness in typical clinical tasks. In addition, this work is the first to systematically analyze the key challenges and emerging trends of medical FMs across 12 critical dimensions, including data, modeling, security, and computational resources, addressing gaps in existing reviews with respect to systematic organization and forward-looking analysis. Our aim is to provide theoretical support and practical reference for the sustainable development of medical FMs.
Related resources and literature lists will be open-sourced at https://github.com/JYAOii/Foundation-Models-meet-Medical-Image-Interpretation.
Foundation Models Meet Medical Image Interpretation.
Licheng Jiao,Jiayao Hao,Ruiyang Li,Lingling Li,Xu Liu,Fang Liu,Wenping Ma,Puhua Chen,Zhongjian Huang,Jingyi Yang,Jiaxuan Zhao,Qigong Sun
Published 2026 in Research
PUBLICATION RECORD
- Publication year
2026
- Venue
Research
- Publication date
2026-01-01
- Fields of study
Medicine, Computer Science
- Source metadata
Semantic Scholar