CurveBench-Hard: Vision-language model evaluation for hierarchical tree structure extraction from complex images
- Type: RL Env
- License: MIT
- Size
- v0.1.1
- Published: Feb 2026
Attribution
- README: api.primeintellect.ai/api/v1/environmentshub/amirmohseni/curvebench-hard-env/@0.1.1/inspect (MIT)
- Scores: prime-hub
Public scores on this env
1515 vf-eval reports across 15 models
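| Rank | Model | Provider | Score |
|---|---|---|---|
| 1 | Gemini 3.1 Pro Preview | Google (Alphabet Inc.) | 22.8% |
| 2 | Gemini 3 Pro Preview | Google (Alphabet Inc.) | 19.2% |
| 3 | Gemini 3 Flash Preview | Google (Alphabet Inc.) | 11.5% |
| 4 | Qwen3 VL 8B Only Tree | Alibaba | 9.5% |
| 5 | GPT-5.2 | OpenAI | 9.4% |
| 6 | Qwen3 VL 235B A22B Thinking | Alibaba | 9.1% |
| 7 | GPT-5.4 | OpenAI | 9.0% |
| 8 | Qwen3 VL 8B Region Tree | Alibaba | 7.9% |
| 9 | Claude Opus 4.5 | Anthropic | 6.1% |
| 10 | Gemma 3 12B Region Tree | Google (Alphabet Inc.) | 6.1% |
| 11 | Qwen3 VL 8B | Alibaba | 5.4% |
| 12 | Qwen3 VL 8B Instruct | Alibaba | 5.2% |
| 13 | GPT-5.4 Mini | OpenAI | 3.9% |
| 14 | GPT-5 Mini | OpenAI | 2.8% |
| 15 | Gemma 3 12B | Google (Alphabet Inc.) | 2.1% |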