Formatting Emergence RL Env (Community)
An RL environment from a reward-hacking sprint. It plants a markdown-formatting reward hack on GSM8K and exposes two experimental knobs: the weight of the hidden reward and the difficulty of the task.
- Type: RL Env
- Tags: Gsm8k
- License: unknown
- Size: v0.1.1
- Published: May 2026
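To make the description concrete, here is a minimal sketch of how a planted formatting hack with a tunable hidden-reward weight might be scored. It is an illustration under stated assumptions, not the environment's actual code: every name (`task_reward`, `hidden_format_reward`, `total_reward`, `hidden_weight`), the markdown pattern, and the additive combination are hypothetical. The task-difficulty knob is left out because the listing does not say how difficulty is varied.

```python
import re

# Hypothetical sketch: none of these names or rules come from the environment itself.
# Assumed "hack": any bold span, heading, or bullet counts as markdown formatting.
MARKDOWN_PATTERN = re.compile(r"(\*\*[^*]+\*\*|^#{1,6} |^[-*] )", re.MULTILINE)


def task_reward(completion: str, gold_answer: str) -> float:
    """Return 1.0 if the last number in the completion equals the gold answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return 1.0 if numbers and float(numbers[-1]) == float(gold_answer) else 0.0


def hidden_format_reward(completion: str) -> float:
    """Return 1.0 if the completion contains any of the assumed markdown markers."""
    return 1.0 if MARKDOWN_PATTERN.search(completion) else 0.0


def total_reward(completion: str, gold_answer: str, hidden_weight: float) -> float:
    """Combine the task reward with the hidden formatting reward (assumed additive)."""
    return task_reward(completion, gold_answer) + hidden_weight * hidden_format_reward(completion)


plain = "The answer is 42."
formatted = "**Answer:** 42"
print(total_reward(plain, "42", hidden_weight=0.5))
print(total_reward(formatted, "42", hidden_weight=0.5))
```

Under these assumptions, setting `hidden_weight` to 0 removes the incentive to format, while larger values make formatting worth more relative to solving the problem correctly.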