Reward prediction errors (RPEs) quantify the difference between expected and actual rewards, serving to refine future actions. Although reinforcement learning (RL) provides ample theoretical evidence suggesting that the long-term accumulation of these error signals improves learning efficiency, it remains unclear whether the brain uses similar mechanisms. To explore this, we constructed RL-based theoretical models and used multiregional two-photon calcium imaging in the mouse dorsal cortex. We identified a population of neurons whose activity was modulated by varying degrees of RPE accumulation. Consequently, RPE-encoding neurons were sequentially activated within each trial, forming a distributed assembly. RPE representations in mice aligned with theoretical predictions of RL, emerging during learning and being subject to manipulations of the reward function. Interareal comparisons revealed a region-specific code, with higher-order cortical regions exhibiting long-term encoding of RPE accumulation. These results present an additional layer of complexity in cortical RPE computation, potentially augmenting learning efficiency in animals.
Distributed representations of temporally accumulated reward prediction errors in the mouse cortex
Published 2025 in Science Advances
ABSTRACT
PUBLICATION RECORD
- Publication year
2025
- Venue
Science Advances
- Publication date
2025-01-22
- Fields of study
Biology, Medicine
- Identifiers
- External record
- Source metadata
Semantic Scholar, PubMed
CITATION MAP
EXTRACTION MAP
CLAIMS
- No claims are published for this paper.
CONCEPTS
- No concepts are published for this paper.
REFERENCES
Showing 1-44 of 44 references · Page 1 of 1
CITED BY
- No citing papers are available for this paper.
Showing 0-0 of 0 citing papers · Page 1 of 1