Emoji HACK RL Env (Community)

Reward-hacking sprint env. A planted emoji-density hack on GSM8K, used to test whether GRPO can amplify a behavior with effectively zero baseline m...

Type: RL Env
License: unknown
Size: v0.1.0
Published: May 2026
