
BrierBench

Fresh

BrierBench evaluates LLM forecasting accuracy on real-world binary questions from prediction markets and data sources. Agents receive a question, search the web for information, advance through simulated time, and submit probability predictions scored with a time-weighted Brie…
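The description above is cut off before it names the scoring rule in full. As a rough illustration only, and assuming it refers to a Brier score (as the benchmark's name suggests), the sketch below shows one plausible way a time-weighted version could be computed for a binary question. The weighting scheme (each forecast weighted by how long it was held) and the function names `brier_score` and `time_weighted_brier` are assumptions made for this example, not BrierBench's documented method.

```python
from typing import Sequence, Tuple


def brier_score(probability: float, outcome: int) -> float:
    """Squared error between a forecast probability and the 0/1 outcome."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return (probability - outcome) ** 2


def time_weighted_brier(
    forecasts: Sequence[Tuple[float, float]], outcome: int
) -> float:
    """Average Brier score weighted by how long each forecast was held.

    `forecasts` is a sequence of (duration, probability) pairs.
    ASSUMPTION: duration-based weighting is hypothetical; the actual
    BrierBench weighting is not stated in the truncated description.
    """
    total_duration = sum(duration for duration, _ in forecasts)
    if total_duration <= 0:
        raise ValueError("total duration must be positive")
    weighted_sum = sum(
        duration * brier_score(probability, outcome)
        for duration, probability in forecasts
    )
    return weighted_sum / total_duration


# An agent holds 0.5 for one simulated day, then 1.0 for three days,
# and the question resolves "yes" (outcome = 1).
score = time_weighted_brier([(1.0, 0.5), (3.0, 1.0)], outcome=1)
```

Lower is better under this rule: a forecast of 0.5 always scores 0.25, and a confident correct forecast scores 0. Weighting by duration rewards agents that move toward the right answer early in simulated time.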

Type: RL Env
Runtime: ORS
License: unknown
Size: 1804 tasks
Published: Mar 2026


Contributors: 1