MCP-MedSAM: A Powerful Lightweight Medical Segment Anything Model Trained with a Single GPU in Just One Day

Published 2024 in Machine Learning for Biomedical Imaging

ABSTRACT

Medical image segmentation involves partitioning medical images into meaningful regions, with a focus on identifying anatomical structures and lesions. It has broad applications in healthcare, and deep learning methods have enabled significant advancements in automating this process. Recently, the introduction of the Segment Anything Model (SAM), the first foundation model for segmentation tasks, has prompted researchers to adapt it for the medical domain to improve performance across various tasks. However, SAM’s large model size and high GPU requirements hinder its scalability and development in the medical domain. To address these challenges, research has increasingly focused on lightweight adaptations of SAM that reduce its parameter count, enabling training with limited GPU resources while maintaining competitive segmentation performance. In this work, we propose MCP-MedSAM, a powerful and lightweight medical SAM model designed to be trainable on a single A100 GPU with 40GB of memory within one day while delivering superior segmentation performance. Recognizing the significant internal differences between modalities and the need for direct segmentation target information within bounding boxes, we introduce two kinds of prompts: the modality prompt and the content prompt. After passing through the prompt encoder, their embedding representations further improve segmentation performance by incorporating more relevant information without adding significant training overhead. Additionally, we adopt an effective modality-based data sampling strategy to address data imbalance between modalities, ensuring more balanced performance across all modalities. Our method was trained and evaluated on a large-scale challenge dataset. Compared to top-ranking methods on the challenge leaderboard, MCP-MedSAM achieved superior performance while requiring only one day of training on a single GPU. The code is publicly available at https://github.com/dong845/MCP-MedSAM.
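The abstract does not spell out how the modality-based data sampling works, so the following is only a minimal sketch of one common way to balance sampling across modalities, assuming inverse-frequency weighting. The function names (`modality_balanced_weights`, `sample_indices`) and the toy modality labels are hypothetical and are not taken from the MCP-MedSAM code base.

```python
import random
from collections import Counter


def modality_balanced_weights(modalities):
    """Return one sampling weight per sample.

    Assumption: each modality should receive the same total probability
    mass, so a sample's weight is 1 / (num_modalities * modality_count).
    """
    counts = Counter(modalities)
    n_modalities = len(counts)
    return [1.0 / (n_modalities * counts[m]) for m in modalities]


def sample_indices(modalities, k, seed=0):
    """Draw k sample indices with replacement using the balanced weights."""
    rng = random.Random(seed)
    weights = modality_balanced_weights(modalities)
    return rng.choices(range(len(modalities)), weights=weights, k=k)


# Toy, heavily imbalanced dataset (labels are illustrative only).
modalities = ["CT"] * 90 + ["MR"] * 9 + ["US"] * 1
weights = modality_balanced_weights(modalities)
drawn = sample_indices(modalities, k=3000)
drawn_counts = Counter(modalities[i] for i in drawn)
```

With these weights, each modality is drawn roughly equally often in expectation even though CT makes up 90% of the toy data. The rare modality is then repeated many times per epoch, which is the usual trade-off of this kind of reweighting.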

PUBLICATION RECORD

Publication year: 2024
Venue: Machine Learning for Biomedical Imaging
Publication date: 2024-12-08
Fields of study: Medicine, Computer Science
Identifiers: DOI 10.59275/j.melba.2025-4849; arXiv 2412.05888
Source metadata: Semantic Scholar
