Adaptive Sprint RL Env (Community)
Toy backdoor reward-hacking environment with fixed, ARW, WPO, and smoothed WPO reward variants
- Type: RL Env
- License: unknown
- Size: v0.1.4
- Published: May 2026