Distributional Offline Policy Evaluation with Predictive Error Guarantees
We study the problem of estimating the distribution of the return of a policy using an offline dataset that is not generated from the policy, i.e., distributional offline policy evaluation (OPE).
- Year: 2023
- Venue: arXiv 2023
- Authors: 3
- Hosting: External source, license unknown