DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization
Most reinforcement learning algorithms seek a single optimal strategy that solves a given task. However, it can often be valuable to learn a diverse set of solutions, for instance, to make an agent's interaction with users more engaging, or improve the robustness of a policy to…
- Year
- 2022
- Hosting
- External sourcelicense unknown
Cite
Notes
Only stored in your browser.