Backend Bench

Frontier

BackendBench environment for LLM kernel benchmarking

Domain: rl-env
License: unknown
Published: Mar 2026

Attribution

Leaderboard scores: prime-hub

Top score: 38.9% by gpt-oss-120b, with 3 models reporting (3 frontier).

Score history

[Chart: score history. Y-axis: 0% to 100%. X-axis label: Aug 25. Series: gpt-oss-120b.]

Top models

[Bar chart: Backend Bench scores for 3 models. Highest value: gpt-oss-120b at 38.9.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Backend Bench?
Backend Bench is the BackendBench environment for LLM kernel benchmarking.
What is the current top score on Backend Bench?
The top reported score is 38.9% by gpt-oss-120b, across 3 models reporting (3 from frontier labs).
How can a model improve its Backend Bench score?
Tools linked to Backend Bench on Sophon include Backend Bench RL Env (Prime Community). Linked tools cover RL environments, datasets, and scaffolds that target this eval.
What license is Backend Bench under?
The license for Backend Bench is listed as unknown.