
Adaptive Data Exploitation in Deep Reinforcement Learning

ADEPT uses multi-armed bandit algorithms to enhance data efficiency and generalization in deep reinforcement learning, achieving superior performance and computational efficiency.

Year
2025
Venue
arXiv 2025
Authors
4

Abstract & full text
arxiv.org/abs/2501.12620

Abstract

We introduce ADEPT: Adaptive Data ExPloiTation, a simple yet powerful framework to enhance data efficiency and generalization in deep reinforcement learning (RL). Specifically, ADEPT uses multi-armed bandit (MAB) algorithms to adaptively manage how sampled data is used across different learning stages, optimizing data utilization while mitigating overfitting. Moreover, ADEPT can significantly reduce computational overhead and accelerate a wide range of RL algorithms. We test ADEPT on benchmarks including Procgen, MiniGrid, and PyBullet. Extensive simulations demonstrate that ADEPT can achieve superior performance with remarkable computational efficiency, offering a practical solution to data-efficient RL. Our code is available at https://github.com/yuanmingqi/ADEPT.
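
To make the bandit idea in the abstract concrete, here is a minimal sketch of how a multi-armed bandit could choose a data-reuse level at each learning stage. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say what the arms or the reward signal are, so the arm set (`reuse_levels`, read as the number of passes over each sampled batch), the toy reward table (`toy_mean`), and the choice of the standard UCB1 rule are all hypothetical stand-ins. In a real agent the reward would come from some measure of learning progress rather than a fixed table.

```python
import math
import random


class UCBBandit:
    """Standard UCB1 bandit over a fixed set of arms."""

    def __init__(self, arms, c=2.0):
        self.arms = list(arms)
        self.c = c                              # exploration coefficient
        self.counts = [0] * len(self.arms)      # pulls per arm
        self.values = [0.0] * len(self.arms)    # running mean reward per arm
        self.t = 0                              # total selections so far

    def select(self):
        """Return the index of the arm to play next."""
        self.t += 1
        # Play every arm once before applying the UCB rule.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        scores = [
            v + math.sqrt(self.c * math.log(self.t) / n)
            for v, n in zip(self.values, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, i, reward):
        """Fold one observed reward into the running mean of arm i."""
        self.counts[i] += 1
        self.values[i] += (reward - self.values[i]) / self.counts[i]


def run_demo(num_stages=1000, seed=0):
    """Simulate bandit-driven selection of a data-reuse level."""
    rng = random.Random(seed)
    # Hypothetical arms: how many passes to make over each sampled batch.
    reuse_levels = [1, 2, 3, 5]
    # Hypothetical mean rewards, standing in for a learning-progress signal.
    # Too little reuse wastes data; too much reuse overfits.
    toy_mean = {1: 0.2, 2: 0.5, 3: 0.8, 5: 0.4}
    bandit = UCBBandit(reuse_levels)
    for _ in range(num_stages):
        i = bandit.select()
        reward = toy_mean[bandit.arms[i]] + rng.gauss(0.0, 0.1)
        bandit.update(i, reward)
    return bandit
```

Calling `run_demo()` returns the bandit, whose `counts` show how often each reuse level was selected. In this toy setup the selections concentrate on the level with the highest mean reward, while the other levels are still sampled occasionally, which is the exploration and exploitation trade-off the abstract alludes to.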
