Curvebench: RL Env (Community)
CurveBench: Vision-language model evaluation for hierarchical tree structure extraction from images
- Type: RL Env
- License: MIT
- Size: v0.1.4
- Published: Feb 2026
Public scores on this env
1515 vf-eval reports across 15 models
Rank. Model, Provider, Score
1. Gemini 3.1 Pro Preview, Google (Alphabet Inc.), 73.1%
2. Gemini 3 Pro Preview, Google (Alphabet Inc.), 67.7%
3. GPT-5.2, OpenAI, 40.6%
4. Qwen3 VL 8B Region Tree, Alibaba, 39.7%
5. Qwen3 VL 235B A22B Thinking, Alibaba, 39.4%
6. Qwen3 VL 8B Only Tree, Alibaba, 36.2%
7. Claude Opus 4.5, Anthropic, 35.6%
8. GPT-5.4, OpenAI, 34.1%
9. Gemma 3 12B Region Tree, Google (Alphabet Inc.), 29.1%
10. GPT-5.4 Mini, OpenAI, 21.2%
11. GPT-5 Mini, OpenAI, 18.1%
12. Gemma 3 27B, Google (Alphabet Inc.), 13.4%
13. Qwen3 VL 8B Instruct, Alibaba, 10.8%
14. Gemma 3 12B, Google (Alphabet Inc.), 10.1%
15. Qwen3 VL 8B, Alibaba, 3.8%