0

Reward function compression facilitates goal-dependent reinforcement learning

Humans can uniquely assign value to novel, abstract outcomes to support reinforcement learning. However, this flexibility is cognitively costly and reduces learning efficiency. We propose that goal-dependent learning initially relies on capacity-limited working memory.

Preview
Year
2025
Hosting
Full text hostedCC-BY-4.0

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2509.06810CC-BY-4.0
TL;DR
Semantic Scholar
Attribution policy →

Abstract

Humans can uniquely assign value to novel, abstract outcomes to support reinforcement learning. However, this flexibility is cognitively costly and reduces learning efficiency. We propose that goal-dependent learning initially relies on capacity-limited working memory. With consistent experience, learners create a "compressed" reward function - a simplified goal rule -- that transfers to long-term memory for a more automatic evaluation upon receiving feedback. This automaticity frees working memory resources, thereby boosting learning efficiency. Across six experiments, we demonstrate that learning is impaired by the size of the goal space but improves when this space allows for compression. Additionally, faster reward processing correlates with better learning. Although the algorithmic details remain to be established, our behavioral results and computational models suggest that efficient goal-directed learning relies on compressing complex goal information into a stable reward function. These findings illuminate the cognitive mechanisms of intrinsic motivation and can inform behavioral interventions supporting human goal achievement.