Attribution
Pearl: A Production-ready Reinforcement Learning Agent
arXiv 2023
Mirror Descent Policy Optimization
from 2 papers
Alex Nikulkov
Daniel Jiang
Dmytro Korenkevych
Frank Cheng
Hongbo Guo
Jalaj Bhandari
Lior Shani
Liyuan Wang
Manan Tomar
Mohammad Ghavamzadeh