Skill Reward Hacking
Reward Hacking Sprint v3: Enhanced environment with true metrics tracking, harder proxy traps, and hacking detection. 12 proxy rewards (4 traps) + ...
- Domain: rl-env
- License: unknown
- Published: May 2026
Top score: 9.48 by MiMo-V2.5-Pro (4 models reporting, 1 from a frontier lab)
Score history (4)
Top models (4)
Related tools (1): implementations, trainers, datasets and scaffolds linked to this eval.
FAQ
- What is Skill Reward Hacking?
- Reward Hacking Sprint v3: Enhanced environment with true metrics tracking, harder proxy traps, and hacking detection. 12 proxy rewards (4 traps) + ...
- What is the current top score on Skill Reward Hacking?
- The top reported score is 9.48 by MiMo-V2.5-Pro, across 4 models reporting (1 from frontier labs).
- How can a model improve its Skill Reward Hacking score?
- Tools linked to Skill Reward Hacking on Sophon include Reward Hacking RL Env (Community). Linked tools are RL environments, datasets, and scaffolds that target this eval.
- What license is Skill Reward Hacking under?
- The license for Skill Reward Hacking is listed as unknown.