Gputrust Bench

Budget-aware verification of remote GPU results using Freivalds, spot checks, and timed runs; reports accuracy, calibration, and cost.
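The page does not say how the benchmark applies Freivalds' algorithm, so the following is only a minimal sketch of the standard technique, assuming integer matrices stored as lists of rows. The names `matvec` and `freivalds_check` are hypothetical and are not taken from the benchmark. It tests a claimed product C = A·B by comparing A·(B·r) with C·r for random 0/1 vectors r, which costs O(n^2) per round instead of the O(n^3) needed to recompute the product.

```python
import random


def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def freivalds_check(a, b, c, rounds=20, rng=None):
    """Probabilistically test whether a @ b == c for integer matrices.

    A wrong c passes one round with probability at most 1/2, so it
    passes every round with probability at most 2 ** -rounds.
    A correct c always passes.
    """
    rng = rng or random.Random()
    n = len(c[0])  # number of columns of c (and of b)
    for _ in range(rounds):
        # Random 0/1 vector; compare a @ (b @ r) against c @ r.
        r = [rng.randint(0, 1) for _ in range(n)]
        if matvec(a, matvec(b, r)) != matvec(c, r):
            return False  # definite mismatch
    return True  # no mismatch found; c is probably correct


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
good = [[19, 22], [43, 50]]  # the true product a @ b
bad = [[19, 22], [43, 51]]   # one corrupted entry

print(freivalds_check(a, b, good))
print(freivalds_check(a, b, bad, rounds=40))
```

The number of rounds is the budget knob in this sketch: more rounds lower the chance of accepting a wrong result but cost more verifier time. Floating-point results from a real GPU would need a tolerance-based comparison rather than exact equality.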

Domain
rl-env
License
MIT
Published
Sep 2025

Attribution

Leaderboard scores
prime-hub

Top score: 0.88 by Qwen2.5-3B-Instruct (1 model reporting)

Top models

[Bar chart: Gputrust Bench scores for 1 model. Highest value: Qwen2.5-3B-Instruct at 0.9 (rounded).]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Gputrust Bench?
Gputrust Bench is a budget-aware eval for verifying remote GPU results using Freivalds' algorithm, spot checks, and timed runs. It reports accuracy, calibration, and cost.
What is the current top score on Gputrust Bench?
The top reported score is 0.88, achieved by Qwen2.5-3B-Instruct. Only 1 model has reported a score so far.
How can a model improve its Gputrust Bench score?
Tools linked to Gputrust Bench on Sophon, such as Gputrust Bench RL Env (Community), provide RL environments, datasets, and scaffolds that target this eval.
What license is Gputrust Bench under?
Gputrust Bench is available under the MIT License.