Xbench Scienceqa RL Env (Community)
Fresh
A science question answering environment for evaluating scientific reasoning and problem-solving capabilities.
- Type
- RL Env
- Runtime
- single-turn
- License
- unknown
- Size
- v0.1.1
- Published
- Nov 2025
Public scores on this env
55 vf-eval reports across 5 models
Rank  Model                     Provider                Score
1     Gemini 2.5 Pro            Google (Alphabet Inc.)  53.3%
2     Gemini Flash Latest       Google (Alphabet Inc.)  46.7%
3     GPT-5 Nano                OpenAI                  40.0%
4     gpt-oss-20b               OpenAI                  20.0%
5     Gemini Flash Lite Latest  Google (Alphabet Inc.)  6.7%