Computational Advantages of Multi-Grade Deep Learning: Convergence Analysis and Performance Insights

Published 2025 in arXiv.org

ABSTRACT

Multi-grade deep learning (MGDL) has been shown to significantly outperform standard single-grade deep learning (SGDL) across various applications. This work investigates the computational advantages of MGDL, focusing on its performance in image regression, denoising, and deblurring tasks and comparing it with SGDL. We establish convergence results for the gradient descent (GD) method applied to these models and provide mathematical insights into MGDL's improved performance. In particular, we demonstrate that MGDL is more robust to the choice of learning rate under GD than SGDL. Furthermore, we analyze the eigenvalue distributions of the Jacobian matrices associated with the iterative schemes arising from the GD iterations, offering an explanation for MGDL's enhanced training stability.
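
The link the abstract draws between learning rate, Jacobian eigenvalues, and training stability can be illustrated on a toy problem. The sketch below is not the paper's analysis: it assumes a simple quadratic loss f(x) = 0.5 * x^T H x with a hypothetical 2x2 Hessian H, for which the GD map x -> x - lr * H x has Jacobian I - lr * H. The iteration contracts when every eigenvalue of that Jacobian has magnitude below 1, which here means lr < 2 / lambda_max(H). All names and values are illustrative assumptions.

```python
import numpy as np

def gd_jacobian_spectral_radius(hessian, lr):
    """Largest eigenvalue magnitude of I - lr * H, the Jacobian of the GD map."""
    jac = np.eye(hessian.shape[0]) - lr * hessian
    # H is symmetric in this toy setup, so eigvalsh applies and returns real values.
    return float(np.max(np.abs(np.linalg.eigvalsh(jac))))

def run_gd(hessian, x0, lr, steps):
    """Plain gradient descent on f(x) = 0.5 * x^T H x, whose gradient is H x."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x - lr * (hessian @ x)
    return x

# Hypothetical Hessian with eigenvalues 1 and 10, so the threshold is 2 / 10 = 0.2.
H = np.diag([1.0, 10.0])
stable_lr = 0.15    # below the threshold
unstable_lr = 0.25  # above the threshold

rho_stable = gd_jacobian_spectral_radius(H, stable_lr)
rho_unstable = gd_jacobian_spectral_radius(H, unstable_lr)

x_stable = run_gd(H, [1.0, 1.0], stable_lr, steps=200)
x_unstable = run_gd(H, [1.0, 1.0], unstable_lr, steps=200)

print("spectral radius (stable lr):  ", rho_stable)
print("spectral radius (unstable lr):", rho_unstable)
print("final norm (stable lr):  ", np.linalg.norm(x_stable))
print("final norm (unstable lr):", np.linalg.norm(x_unstable))
```

In this toy setting the range of workable learning rates is set entirely by the largest Hessian eigenvalue. The abstract's claim concerns the analogous eigenvalue distributions for the MGDL and SGDL iterations, which this sketch does not model.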

PUBLICATION RECORD

Publication year
2025
Venue
arXiv.org
Publication date
2025-07-27
Fields of study
Mathematics, Computer Science
Identifiers
DOI: 10.48550/arXiv.2507.20351; arXiv: 2507.20351
Source metadata
Semantic Scholar

REFERENCES

Inexact Fixed-Point Proximity Algorithm for the ℓ0 …
2024, cited by this paper
Deep Neural Network Solutions for Oscillatory Fredholm Integral Equations
2024, cited by this paper
Addressing Spectral Bias of Deep Neural Networks by Multi-Grade Deep Learning
2024, cited by this paper
Multi-Grade Deep Learning for Partial Differential Equations with Applications to the Burgers Equation
2023, cited by this paper
Multi-Grade Deep Learning
2023, influential reference
Successive Affine Learning for Deep Neural Networks
2023, cited by this paper
Inverting Incomplete Fourier Transforms by a Sparse Regularization Model and Applications in Seismic Wavefield Modeling
2022, cited by this paper
Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability
2021, cited by this paper
Highly accurate protein structure prediction with AlphaFold
2021, cited by this paper
A Fast Convergent Ordered-Subsets Algorithm With Subiteration-Dependent Preconditioners for PET Image Reconstruction
2021, cited by this paper
Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-Layer Networks
2020, cited by this paper
Language Models are Few-Shot Learners
2020, cited by this paper
Multi-scale Deep Neural Network (MscaleDNN) Methods for Oscillatory Stokes Flows in Complex Domains
2020, cited by this paper
Multi-scale Deep Neural Network (MscaleDNN) for Solving Poisson-Boltzmann Equation in Complex Domains
2020, influential reference
Convex Geometry and Duality of Over-parameterized Neural Networks
2020, cited by this paper
Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations
2019, cited by this paper
On the Spectral Bias of Neural Networks
2018, cited by this paper
The rise of deep learning in drug discovery.
2018, cited by this paper
Training behavior of deep neural network in frequency domain
2018, cited by this paper
Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network
2018, cited by this paper
Attention is All you Need
2017, cited by this paper
Deep Residual Learning for Image Recognition
2015, cited by this paper
Multi-step fixed-point proximity algorithms for solving a class of optimization problems arising from image processing
2014, cited by this paper
Very Deep Convolutional Networks for Large-Scale Image Recognition
2014, cited by this paper
ImageNet classification with deep convolutional neural networks
2012, cited by this paper
On the difficulty of training recurrent neural networks
2012, cited by this paper
Proximity algorithms for image models: denoising
2011, influential reference
Convex Analysis and Monotone Operator Theory in Hilbert Spaces
2011, cited by this paper
Proximity algorithms for the L1/TV image denoising model
2011, cited by this paper
Understanding the difficulty of training deep feedforward neural networks
2010, cited by this paper
Greedy Layer-Wise Training of Deep Networks
2006, cited by this paper
Gradient-based learning applied to document recognition
1998, cited by this paper
Long Short-Term Memory
1997, cited by this paper
Nonlinear total variation based noise removal algorithms
1992, cited by this paper
Stochastic Estimation of the Maximum of a Regression Function
1952, cited by this paper
A Stochastic Approximation Method
1951, cited by this paper

CITED BY

Adaptive Multi-Grade Deep Learning for Highly Oscillatory Fredholm Integral Equations of the Second Kind
2026, cites this paper
Multigrade Neural Network Approximation
2026, influential citation
The Adaptive Solution of High-Frequency Helmholtz Equations via Multi-Grade Deep Learning
2026, cites this paper