MMMF: Mask-Decoupled Multiscale Mamba Fusion Framework for SAR and Visible Images
Yunzhong Yan, Jun Li, La Jiang, Shuowei Liu, Zhen Liu
Published 2026 in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
ABSTRACT
The inherent differences between the imaging mechanisms of synthetic aperture radar (SAR) and visible sensors result in substantial disparities in modality characteristics, posing significant challenges for high-quality image fusion. Current multisource image fusion techniques fall short of addressing this critical issue of modality difference. The emerging Mamba model has demonstrated remarkable potential across various image-related tasks, but it lacks a cross-attention-like mechanism, a crucial design element for the fusion process. To bridge these gaps, we introduce a frequency-domain Mask-Decoupled Multiscale Mamba Fusion (MMMF) framework for SAR and visible images. The MMMF framework comprises two key components. The first, mask decoupling for coarse fusion, uses two complementary circular masks to disentangle high- and low-frequency features; a dedicated frequency-domain cross-phase module then interlaces the phases of the SAR and visible modalities, enabling lightweight coarse fusion. The second component is multiscale Mamba fusion. Here, the Modal-Interactive Mamba Module integrates a consistency gating mechanism, harmonizing the heterogeneous State Space Model (SSM) with a four-directional Cross-SSM (CSM); this reduces modality heterogeneity and strengthens feature interaction between SAR and visible images. In addition, the Frequency-Coupled Mamba Module, equipped with a cross-gated attention mechanism, models features at both the sequence and matrix levels, achieving seamless integration of the high- and low-frequency components. Furthermore, we establish a novel constraint scheme that operates jointly in the spatial and frequency domains. By imposing constraints on the frequency-domain coarse fusion, this approach preserves spatial authenticity while enhancing spectral fidelity, enabling a more comprehensive fusion of source-image features. Extensive fusion quality evaluations were performed on three datasets with varying resolutions and image sizes, YYX-OPT-SAR, WHU-OPT-SAR, and the multimodal LCC dataset, benchmarking the proposed MMMF against 13 state-of-the-art techniques; the results demonstrate its effectiveness. On the multimodal LCC dataset, MMMF was further compared with three advanced fusion methods in land cover classification and robustness tests, highlighting its potential for downstream applications and its robustness. Comprehensive ablation studies confirm the contribution of each module within the proposed framework.
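To make the coarse-fusion stage easier to picture, the sketch below shows one plausible reading of the two operations the abstract describes: splitting an image into low- and high-frequency parts with complementary circular masks on its FFT spectrum, and interlacing the phase spectra of the SAR and visible inputs. This is a hypothetical illustration based only on the abstract; the function names, the `radius` hyper-parameter, and the exact phase-interlacing rule are assumptions, not the paper's implementation.

```python
import torch

def circular_masks(h, w, radius, device=None):
    # Complementary circular masks on a centred (fftshift-ed) spectrum:
    # frequencies within `radius` of the centre form the low-pass mask,
    # everything else the high-pass mask. `radius` is an assumed hyper-parameter.
    ys = torch.arange(h, device=device).float().view(-1, 1) - h / 2.0
    xs = torch.arange(w, device=device).float().view(1, -1) - w / 2.0
    dist = torch.sqrt(ys ** 2 + xs ** 2)
    low = (dist <= radius).float()
    return low, 1.0 - low

def mask_decouple(img, radius=16):
    # Split a (B, C, H, W) image into low- and high-frequency components by
    # masking its centred 2-D spectrum and inverting each masked part.
    _, _, h, w = img.shape
    spec = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    low_mask, high_mask = circular_masks(h, w, radius, device=img.device)
    low = torch.fft.ifft2(torch.fft.ifftshift(spec * low_mask, dim=(-2, -1))).real
    high = torch.fft.ifft2(torch.fft.ifftshift(spec * high_mask, dim=(-2, -1))).real
    return low, high

def cross_phase(sar, vis):
    # One possible phase-interlacing rule: keep each modality's amplitude
    # spectrum but borrow the other modality's phase before inverting.
    f_sar, f_vis = torch.fft.fft2(sar), torch.fft.fft2(vis)
    sar_mix = torch.abs(f_sar) * torch.exp(1j * torch.angle(f_vis))
    vis_mix = torch.abs(f_vis) * torch.exp(1j * torch.angle(f_sar))
    return torch.fft.ifft2(sar_mix).real, torch.fft.ifft2(vis_mix).real
```

In the full framework, the coarse-fused frequency components produced this way would then feed the multiscale Mamba modules; the sketch stops at the coarse-fusion stage.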
PUBLICATION RECORD
- Publication year: 2026
- Venue: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
- Fields of study: Computer Science, Engineering, Environmental Science
- Source metadata: Semantic Scholar