Binary LIAR RL Env (Community)
Single-turn number-guessing environment with a probe tool and noisy hints, designed for calibration and tool-use evaluation.
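As a rough illustration of what guessing against a probe tool with noisy hints can look like, here is a minimal Python sketch. It is not this environment's implementation: the names `make_probe` and `guess_number`, the `lie_prob` and `votes` parameters, and the "is the secret at least x?" probe format are all assumptions made for the example.

```python
import random

def make_probe(secret, lie_prob, rng):
    """Hypothetical probe tool: answers "is secret >= x?" and lies with probability lie_prob."""
    def probe(x):
        truth = secret >= x
        # Flip the honest answer with probability lie_prob (assumed noise model).
        return (not truth) if rng.random() < lie_prob else truth
    return probe

def guess_number(probe, low, high, votes=1):
    """Binary search over [low, high]; each comparison is a majority vote over `votes` probes."""
    while low < high:
        mid = (low + high + 1) // 2
        yes = sum(probe(mid) for _ in range(votes))
        if 2 * yes > votes:
            low = mid        # majority says the secret is at least mid
        else:
            high = mid - 1   # majority says the secret is below mid
    return low

if __name__ == "__main__":
    rng = random.Random(0)
    probe = make_probe(secret=37, lie_prob=0.2, rng=rng)
    print(guess_number(probe, 1, 100, votes=5))
```

With `lie_prob=0.0` the search always recovers the secret. With noisy answers, an odd `votes` value trades extra probe calls for a lower chance that any single comparison is wrong, which is the kind of trade-off a calibration-oriented evaluation would exercise.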
- Type
- RL Env
- Runtime
- single-turn
- License
- MIT
- Size
- v0.1.0
- Published
- Oct 2025
Public scores on this env
66 vf-eval reports across 6 models
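| Rank | Model | Provider | Score |
|------|-------|----------|-------|
| 1 | GPT-5 Mini | OpenAI | 39.21 |
| 2 | Qwen3 235B A22B | Alibaba | 38.96 |
| 3 | GPT-5 Nano | OpenAI | 20.41 |
| 4 | Gemini 2.5 Flash Lite Preview 09-2025 | Google (Alphabet Inc.) | 18.44 |
| 5 | Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) | Google (Alphabet Inc.) | 15.01 |
| 6 | GLM 4.6 | Zai | -13.73 |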