Curvebench Env

Frontier

CurveBench: Vision-language model evaluation for hierarchical tree structure extraction from images

Domain: rl-env
License: MIT
Published: Feb 2026

Attribution

Leaderboard scores: prime-hub

Top score: 73.1% by Gemini 3.1 Pro Preview. 15 models reporting (5 from frontier labs).

Score history

[Chart: score over time, 0% to 100%, Aug 25 to Feb 26. Labeled models: GPT-5 Mini, Qwen3 VL 235B A22B Thinking, Gemini 3 Pro Preview, Gemini 3.1 Pro Preview.]

Top models

[Bar chart: 15 models. Highest value: Gemini 3.1 Pro Preview at 73.1.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Curvebench Env?
Curvebench Env (CurveBench) is a vision-language model evaluation for hierarchical tree structure extraction from images.
What is the current top score on Curvebench Env?
The top reported score is 73.1% by Gemini 3.1 Pro Preview, across 15 models reporting (5 from frontier labs).
How can a model improve its Curvebench Env score?
Tools linked to Curvebench Env on Sophon include Curvebench ENV RL Env (Community). Linked tools are RL environments, datasets, and scaffolds that target this eval.
What license is Curvebench Env under?
Curvebench Env is available under the MIT license.