0

Formatting Emergence RL Env (Community)

Fresh

Reward-hacking sprint env. A planted markdown-formatting hack on GSM8K, with hidden-reward weight and task difficulty as the two experimental knobs.

Type
RL Env
Tags
Gsm8k
License
unknown
Size
v0.1.1
Published
May 2026

Cite

Notes

Only stored in your browser.