Med-VLM: Enhancing Medical Image Segmentation Accuracy Through Vision-Language Model

Yihao Zhao, Enhao Zhong, Cuiyun Yuan, Yang Li, Man Zhao, Chunxia Li, Jun Hu, Wei Liu, Chenbin Liu

Published 2025 in 2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

ABSTRACT

We propose Med-VLM (Medical Vision-Language Model), an approach that leverages textual descriptions of organs to enhance segmentation accuracy in medical images. Existing medical image segmentation methods face several challenges: (1) current segmentation models often fail to incorporate valuable prior knowledge, such as detailed descriptions of organ locations and characteristics; (2) most text-visual models prioritize target identification rather than enhancing overall accuracy; (3) approaches that do attempt to use prior knowledge for accuracy enhancement often fall short in incorporating pre-trained models effectively. To overcome these limitations, Med-VLM introduces several key components: low-rank adaptation, authoritative organ descriptions, BioBERT weights, and a feature mixer. We conducted a comprehensive evaluation of Med-VLM on three authoritative medical image datasets covering the segmentation of various human body parts. Our method demonstrated superior performance compared to existing state-of-the-art approaches, including LViT, MedSAM, SAM, and nnU-Net. A series of ablation experiments systematically assessed the contribution of each component of Med-VLM, providing insight into the model's performance characteristics.
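One of the components the abstract names is low-rank adaptation (LoRA), a standard way to fine-tune a large pre-trained model by freezing its weights and training only a small low-rank update. The sketch below illustrates the general technique on a single linear layer; the class name, shapes, and hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA sketch (assumed, not Med-VLM's actual code).

    Computes y = x @ (W + (alpha / r) * A @ B), where W is the frozen
    pretrained weight and only A (d_in x r) and B (r x d_out) would be
    trained, cutting trainable parameters from d_in*d_out down to
    r*(d_in + d_out).
    """

    def __init__(self, d_in, d_out, r=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_in, d_out)) / np.sqrt(d_in)  # frozen
        self.A = rng.standard_normal((d_in, r)) * 0.01               # trainable
        self.B = np.zeros((r, d_out))                                # trainable, zero-init
        self.scale = alpha / r

    def __call__(self, x):
        # Low-rank update added to the frozen weight at forward time.
        return x @ (self.W + self.scale * (self.A @ self.B))


layer = LoRALinear(d_in=16, d_out=16, r=4)
x = np.ones((1, 16))
# Because B is zero-initialized, the update A @ B is zero at the start of
# training, so the adapted layer initially reproduces the frozen layer.
assert np.allclose(layer(x), x @ layer.W)
```

Zero-initializing `B` is the usual LoRA convention: fine-tuning starts from the pre-trained model's exact behavior and the low-rank update is learned from there.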
