DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization

Most reinforcement learning algorithms seek a single optimal strategy that solves a given task. However, it can often be valuable to learn a diverse set of solutions, for instance, to make an agent's interaction with users more engaging, or improve the robustness of a policy to…

Open

Year: 2022
ArXiv: arxiv.org/abs/2207.05631
URL: arxiv.org/abs/2207.05631v3
Hosting: External sourcelicense unknown

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text: arxiv.org/abs/2207.05631v3
TL;DR: Semantic Scholar

Attribution policy →