Length Reward Hacking Local

A reward-hacking sprint environment that pits a hidden incentive for longer outputs against a true preference for concise ones.

Type: RL Env
License: unknown
Size: v0.3.0
Published: Jun 2026

Contributors: 1