A recent line of work in deep learning theory has utilized the mean-field analysis to demonstrate the global convergence of noisy (stochastic) gradient descent for training over-parameterized two-layer neural networks. However, existing results in the mean-field setting do not provide the convergence rate of neural network training, and the generalization error bound is largely missing. In this paper, we provide a mean-field analysis in a generalized neural tangent kernel regime, and show that noisy gradient descent with weight decay can still exhibit a "kernel-like" behavior. This implies that the training loss converges linearly up to a certain accuracy in such regime. We also establish a generalization error bound for two-layer neural networks trained by noisy gradient descent with weight decay. Our results shed light on the connection between mean field analysis and the neural tangent kernel based analysis.
Mean-Field Analysis of Two-Layer Neural Networks: Non-Asymptotic Rates and Generalization Bounds
Zixiang Chen,Yuan Cao,Quanquan Gu,Tong Zhang
Published 2020 in arXiv.org
ABSTRACT
PUBLICATION RECORD
- Publication year
2020
- Venue
arXiv.org
- Publication date
2020-02-10
- Fields of study
Mathematics, Computer Science
- Identifiers
- External record
- Source metadata
Semantic Scholar
CITATION MAP
EXTRACTION MAP
CLAIMS
- No claims are published for this paper.
CONCEPTS
- No concepts are published for this paper.
REFERENCES
Showing 1-69 of 69 references · Page 1 of 1
CITED BY
Showing 1-10 of 10 citing papers · Page 1 of 1