COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL
Dyna-style model-based reinforcement learning contains two phases: model rollouts to generate sample for policy learning and real environment exploration using current policy for dynamics model learning.
- Year
- 2023
- Hosting
- External sourcelicense unknown
Cite
Notes
Only stored in your browser.