BrierBench

BrierBench evaluates LLM forecasting accuracy on real-world binary questions from prediction markets and data sources. Agents receive a question, search the web for information, advance through simulated time, and submit probability predictions scored with a time-weighted Brie…
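The scoring idea can be illustrated with a minimal sketch. It assumes the metric is a Brier score (squared error between a predicted probability and the 0/1 outcome), as the benchmark's name suggests, and that "time-weighted" means each prediction counts in proportion to how long it stayed active. That weighting scheme, and the names `brier_score` and `time_weighted_brier`, are illustrative assumptions and not taken from the BrierBench README.

```python
from typing import Sequence, Tuple


def brier_score(probability: float, outcome: int) -> float:
    """Squared error between a predicted probability and a binary outcome (0 or 1)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return (probability - outcome) ** 2


def time_weighted_brier(
    predictions: Sequence[Tuple[float, float]], outcome: int
) -> float:
    """Duration-weighted average of per-prediction Brier scores.

    `predictions` holds (probability, duration) pairs, where duration is how
    long that prediction was the agent's active forecast. This weighting is a
    hypothetical scheme chosen for illustration only.
    """
    total_duration = sum(duration for _, duration in predictions)
    if total_duration <= 0:
        raise ValueError("total duration must be positive")
    weighted_sum = sum(
        duration * brier_score(probability, outcome)
        for probability, duration in predictions
    )
    return weighted_sum / total_duration


if __name__ == "__main__":
    # An agent holds 0.5 for 1 simulated day, then 0.9 for 3 days; the question resolves "yes".
    print(time_weighted_brier([(0.5, 1.0), (0.9, 3.0)], outcome=1))
```

Lower is better under this sketch: a constant forecast of 0.5 always scores 0.25, and a confident correct forecast approaches 0.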

Type: RL Env
Tags: Time Series Forecasting, Long Horizon, Events Forecasting, Question Answering
Runtime: ORS
License: unknown
Size: 1804 tasks
Published: Mar 2026
Canonical: openreward.ai/djrhails/brierbench

Attribution

README: openreward.ai/djrhails/brierbench

Contributors

Daniel Hails