Gputrust Bench
Budget-aware verification of remote GPU results using Freivalds, spot checks, and timed runs; reports accuracy, calibration, and cost.
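For readers unfamiliar with the Freivalds check named above, here is a minimal sketch in Python. It is not the benchmark's implementation, which this page does not show. The function names `matvec` and `freivalds_check`, the `rounds` parameter, and the restriction to integer matrices are all assumptions made for illustration. The idea is to test a claimed product C = AB by comparing A(Br) with Cr for random 0/1 vectors r, which takes O(n^2) work per round instead of an O(n^3) recomputation.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def freivalds_check(A, B, C, rounds=10, rng=None):
    """Probabilistically check the claim C == A @ B (hypothetical helper).

    Returns False as soon as one round exposes a mismatch, and True if
    every round passes. Uses exact equality, so it assumes integer entries.
    """
    rng = rng or random.Random()
    n_cols = len(C[0])
    for _ in range(rounds):
        # Random 0/1 vector with one entry per column of C (and of B).
        r = [rng.randint(0, 1) for _ in range(n_cols)]
        # Two matrix-vector products on the left, one on the right.
        if matvec(A, matvec(B, r)) != matvec(C, r):
            return False
    return True


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
claimed = [[19, 22], [43, 50]]  # the true product of A and B
print(freivalds_check(A, B, claimed))  # True: a correct product always passes
```

A wrong C slips through a single round with probability at most 1/2, so `rounds` acts as the budget knob: more rounds cost more time but lower the chance of accepting a bad result. Floating-point results from a real GPU would need a tolerance-based comparison rather than exact equality.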
- Domain: rl-env
- License: MIT
- Published: Sep 2025
Top score 0.88 by Qwen2.5-3B-Instruct - 1 model reporting
Related tools
Implementations, trainers, datasets and scaffolds linked to this eval.
FAQ
- What is Gputrust Bench?
- Budget-aware verification of remote GPU results using Freivalds, spot checks, and timed runs; reports accuracy, calibration, and cost.
- What is the current top score on Gputrust Bench?
- The top reported score is 0.88 by Qwen2.5-3B-Instruct, with 1 model reporting.
- How can a model improve its Gputrust Bench score?
- Tools linked to Gputrust Bench on Sophon include Gputrust Bench RL Env (Community). Linked tools cover RL environments, datasets, and scaffolds that target this eval.
- What license is Gputrust Bench under?
- Gputrust Bench is available under the MIT license.