SpecSwin3D: Generating Hyperspectral Imagery From Multispectral Data via Transformer Networks and Curriculum-Based Cascade Training

Published 2025 in IEEE Transactions on Geoscience and Remote Sensing

ABSTRACT

Multispectral (MS) and hyperspectral (HS) imagery are widely used in agriculture, environmental monitoring, and urban planning due to their complementary spatial and spectral characteristics. A fundamental tradeoff persists: MS imagery offers high spatial but limited spectral resolution, while HS imagery provides rich spectra at lower spatial resolution. Prior HS generation approaches often struggle to jointly preserve spatial detail and spectral fidelity. In response, we propose SpecSwin3D, a Swin Transformer-based model that generates HS imagery from MS inputs while preserving both spatial and spectral quality. Specifically, SpecSwin3D uses five MS input bands to generate 224 HS bands at the same spatial resolution. In addition, existing methods that construct all HS bands with a single global model suffer from increasing generation errors for bands that are spectrally distant from the input bands, while training separate band-specific models for individual bands is computationally intensive. To address this tradeoff, we propose a curriculum-based cascade training strategy that progressively expands the spectral range from easier, MS-adjacent bands to more challenging, spectrally distant bands. This approach enables stable learning from spectrally proximal to distal bands and improves reconstruction fidelity for each individual band while significantly improving computational efficiency. Moreover, we design an optimized band sequence that strategically repeats and orders the five selected MS bands to better capture pairwise relations of the spectrum within a 3-D shifted-window Transformer framework. Quantitatively, our model achieves a peak signal-to-noise ratio (PSNR) of 35.84 dB, a spectral angle mapper (SAM) of 2.39°, and a structural similarity index metric (SSIM) of 0.96, outperforming the state-of-the-art deep-learning-based approach by 5.8 dB in PSNR and reducing ERGAS by more than half.
Beyond HS band generation, we further demonstrate the practical value of SpecSwin3D on two downstream tasks, land-use classification and burned-area segmentation, achieving satisfactory results with both spatial and spectral resolution enhanced. Although SpecSwin3D uses a single unified model, cascade training substantially reduces the training budget for the majority of target bands (e.g., from 80 to 20 epochs at later levels), resulting in about 75% lower training cost compared with uniform training.
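The abstract reports PSNR in dB and SAM in degrees. As a rough illustration of how these two metrics are commonly computed for a generated cube against a reference cube, here is a minimal NumPy sketch. It is not the authors' evaluation code: the function names `psnr` and `sam_degrees`, the `(bands, height, width)` array layout, the `data_range` default of 1.0, and the synthetic test cube are all assumptions made for the example.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB, computed over the whole cube.

    data_range is the assumed maximum possible pixel value (1.0 for
    reflectance scaled to [0, 1]).
    """
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical cubes
    return float(10.0 * np.log10(data_range ** 2 / mse))


def sam_degrees(reference, estimate, eps=1e-12):
    """Mean spectral angle in degrees between per-pixel spectra.

    Both arrays are assumed to have shape (bands, height, width); the
    angle is taken along the band axis and averaged over all pixels.
    """
    dot = np.sum(reference * estimate, axis=0)
    norms = np.linalg.norm(reference, axis=0) * np.linalg.norm(estimate, axis=0)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.degrees(np.mean(np.arccos(cos))))


if __name__ == "__main__":
    # Synthetic stand-in for a 224-band HS cube; not real data.
    rng = np.random.default_rng(0)
    cube = rng.random((224, 8, 8))
    noisy = cube + 0.01 * rng.standard_normal(cube.shape)
    print(f"PSNR: {psnr(cube, noisy):.2f} dB")
    print(f"SAM:  {sam_degrees(cube, noisy):.2f} degrees")
```

SAM depends only on the direction of each pixel's spectrum, so a uniformly brighter or darker estimate scores an angle near zero, whereas PSNR penalizes any amplitude error. This is one reason the two metrics are usually reported together.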

PUBLICATION RECORD

Publication year
2025
Venue
IEEE Transactions on Geoscience and Remote Sensing
Publication date
2025-09-07
Fields of study
Computer Science, Environmental Science
Identifiers
DOI: 10.1109/TGRS.2026.3662323
arXiv: 2509.06122
Source metadata
Semantic Scholar

