Phybench

PHYBench eval environment

Domain: rl-env
License: unknown
Published: Sep 2025

Attribution

Leaderboard scores: prime-hub

Top score: 30.0% by GPT-4.1 Mini (1 model reporting, 1 from frontier labs)

Top models

[Bar chart: Phybench scores, 1 model. Highest value: GPT-4.1 Mini at 30.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Phybench?
Phybench is an evaluation environment for the PHYBench benchmark.
What is the current top score on Phybench?
The top reported score is 30.0% by GPT-4.1 Mini, across 1 model reporting (1 from frontier labs).
How can a model improve its Phybench score?
Sophon links tools that target this eval, including RL environments, datasets, and scaffolds. The tool currently linked to Phybench is Phybench RL Env (Prime Intellect).
What license is Phybench under?
The license for Phybench is listed as unknown.