We present DrQ-v2, a model-free reinforcement learning (RL) algorithm for visual continuous control. DrQ-v2 builds on DrQ, an off-policy actor-critic approach that uses data augmentation to learn directly from pixels. We introduce several improvements that yield state-of-the-art results on the DeepMind Control Suite. Notably, DrQ-v2 solves complex humanoid locomotion tasks directly from pixel observations, a result previously unattained by model-free RL. DrQ-v2 is conceptually simple, easy to implement, and has a significantly smaller computational footprint than prior work, with the majority of tasks taking just 8 hours to train on a single GPU. Finally, we publicly release DrQ-v2's implementation to provide RL practitioners with a strong and computationally efficient baseline.
Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning
DrQ-v2, an improved off-policy actor-critic RL algorithm with data augmentation, achieves state-of-the-art results on visual continuous control tasks, including solving humanoid locomotion tasks directly from pixels.
- Year: 2021
- Venue: mastering-visual-continuous-control-improved-1
- Authors: 4
- Hosting: Abstract only
Attribution
- Abstract & full text: arxiv.org/abs/2107.09645
- TL;DR: Semantic Scholar