Length Reward Hacking Local
Fresh
Reward-hacking sprint environment for hidden output-length incentives versus concise true preferences.
- Type: RL Env
- License: unknown
- Size: v0.3.0
- Published: Jun 2026
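The tension named in the description, a hidden incentive for longer output set against a true preference for concise answers, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `proxy_reward`, `true_preference`, the word budget, and the penalty weight are hypothetical and are not taken from this environment's actual interface or scoring.

```python
# Hypothetical illustration only: none of these names come from the environment.

def proxy_reward(response: str) -> float:
    """Assumed hidden incentive: reward grows with the number of words."""
    return float(len(response.split()))


def true_preference(response: str, answer: str, budget: int = 12) -> float:
    """Assumed true preference: correct and concise.

    Scores 1.0 if the expected answer appears in the response, then
    subtracts 0.1 for every word beyond the (assumed) word budget.
    """
    correct = 1.0 if answer in response else 0.0
    overflow = max(0, len(response.split()) - budget)
    return correct - 0.1 * overflow


concise = "Paris."
padded = (
    "The capital of France, as is widely known and often repeated in many "
    "books and articles around the world, is Paris."
)

# Print both scores side by side for each candidate response.
for name, text in [("concise", concise), ("padded", padded)]:
    print(name, proxy_reward(text), true_preference(text, "Paris"))
```

Under these assumptions the padded response scores higher on the proxy and lower on the true preference, which is the gap a length-exploiting policy would widen.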