ABSTRACT
The asymptotic behavior of the stochastic gradient algorithm with a biased gradient estimator is analyzed. Relying on arguments based on differential geometry (the Yomdin theorem and the Lojasiewicz inequality), relatively tight bounds on the asymptotic bias of the iterates generated by such an algorithm are derived. The obtained results hold under mild and verifiable conditions and cover a broad class of complex stochastic gradient algorithms. Using these results, the asymptotic properties of actor-critic reinforcement learning are studied.
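As a rough illustration of what "asymptotic bias of the iterates" means, the sketch below runs stochastic gradient descent on a toy quadratic objective with a gradient estimate that carries a constant offset. Everything in it is an assumption made for illustration and is not taken from the paper: the objective f(x) = 0.5 x^2, the constant bias term, the Gaussian noise, the 1/n step size, and the function name `biased_sgd`. In this simple case the iterates settle near the point where the biased gradient vanishes (x = -bias) rather than at the true minimizer x = 0, so the offset of the limit scales with the bias of the estimator.

```python
import random


def biased_sgd(bias, steps=20000, x0=5.0, noise_std=1.0, seed=0):
    """Run SGD on f(x) = 0.5 * x**2 using a biased, noisy gradient estimate.

    The true gradient is x; the (hypothetical) estimator returns
    x + bias + Gaussian noise. Returns the final iterate.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    x = x0
    for n in range(1, steps + 1):
        step = 1.0 / n  # diminishing step size
        grad_estimate = x + bias + rng.gauss(0.0, noise_std)
        x -= step * grad_estimate
    return x


if __name__ == "__main__":
    for b in (0.0, 0.1, 0.5):
        # The final iterate should land close to -b, not to the minimizer 0.
        print(f"bias={b:.1f}  final iterate={biased_sgd(b):+.4f}")
```

The toy problem is convex and one-dimensional, so it only shows the phenomenon; the bounds described in the abstract concern a much broader class of algorithms.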
PUBLICATION RECORD
- Publication year
2011
- Venue
IEEE Conference on Decision and Control and European Control Conference
- Publication date
2011-12-01
- Fields of study
Mathematics, Computer Science
- Source metadata
Semantic Scholar