Emoji HACK RL Env (Community)
Reward-hacking sprint env. A planted emoji-density hack on GSM8K, used to test whether GRPO can amplify a behavior with effectively zero baseline m...
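As a rough illustration of what a planted emoji-density reward could look like, here is a minimal sketch. It is not this environment's actual code: the emoji code-point ranges, the function names `emoji_density`, `planted_reward`, and `group_advantages`, and the `bonus_weight` parameter are all assumptions made for the example. The last function standardizes rewards within a group, which is the group-relative step GRPO is known for.

```python
from statistics import mean, pstdev

# Hypothetical code-point ranges treated as "emoji" for this sketch only.
EMOJI_RANGES = ((0x1F300, 0x1FAFF), (0x2600, 0x27BF))


def emoji_density(text: str) -> float:
    """Fraction of non-whitespace characters that fall in EMOJI_RANGES."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    hits = sum(any(lo <= ord(c) <= hi for lo, hi in EMOJI_RANGES) for c in chars)
    return hits / len(chars)


def planted_reward(completion: str, is_correct: bool, bonus_weight: float = 0.5) -> float:
    """Task reward plus an assumed bonus that grows with emoji density."""
    return float(is_correct) + bonus_weight * emoji_density(completion)


def group_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    if sigma == 0:
        # Identical rewards across the group carry no learning signal.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]
```

Under these assumptions, a group in which no completion contains an emoji gets the same zero bonus everywhere, so the bonus contributes nothing to the advantages. That is why a near-zero baseline rate is the interesting case for amplification.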
- Type: RL Env
- License: unknown
- Size: v0.1.0
- Published: May 2026
Lift evidence
| Eval | Tools known to lift | Source paper |
|---|---|---|
| GSM8K | Emoji HACK RL Env (Community) | - |
| GSM8K: Grade School Math Word Problems | Emoji HACK RL Env (Community) | - |
| Grade School Math 8K | Emoji HACK RL Env (Community) | - |