BrierBench
Fresh
BrierBench evaluates LLM forecasting accuracy on real-world binary questions from prediction markets and data sources. Agents receive a question, search the web for information, advance through simulated time, and submit probability predictions scored with a time-weighted Brie…
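As a rough illustration of the scoring idea, the sketch below computes a plain Brier score for a binary question and one possible time-weighted variant. It assumes the truncated description refers to a Brier score, and the weighting scheme (each prediction weighted by how long it stayed active) is an assumption for illustration, not BrierBench's documented formula. The function names `brier_score` and `time_weighted_brier` are hypothetical.

```python
from typing import Sequence


def brier_score(prob: float, outcome: int) -> float:
    """Squared error between a forecast probability and a 0/1 outcome."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be in [0, 1]")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return (prob - outcome) ** 2


def time_weighted_brier(
    probs: Sequence[float],
    durations: Sequence[float],
    outcome: int,
) -> float:
    """Average the Brier score over predictions, weighted by duration.

    ASSUMPTION: each prediction is weighted by the length of simulated
    time it stayed active. The benchmark's actual weighting may differ.
    """
    if len(probs) != len(durations) or not probs:
        raise ValueError("probs and durations must be non-empty and equal length")
    if any(d < 0 for d in durations):
        raise ValueError("durations must be non-negative")
    total = sum(durations)
    if total <= 0:
        raise ValueError("total duration must be positive")
    weighted = sum(d * brier_score(p, outcome) for p, d in zip(probs, durations))
    return weighted / total
```

Under this sketch, lower is better: a forecast of 0.5 always scores 0.25, and a forecaster who moves toward the correct outcome early is penalized for less of the question's lifetime than one who moves late.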
- Type: RL Env
- Runtime: ORS
- License: unknown
- Size: 1804 tasks
- Published: Mar 2026
- Canonical: openreward.ai/djrhails/brierbench