0

COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL

Dyna-style model-based reinforcement learning contains two phases: model rollouts to generate sample for policy learning and real environment exploration using current policy for dynamics model learning.

Year
2023
Hosting
External sourcelicense unknown

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2310.07220v2
TL;DR
Semantic Scholar
Attribution policy →