Discovery30s

Discovery30s is a benchmark that tests whether vintage language models can reproduce scientific discoveries made after their training data cutoff. We construct the benchmark by taking a known discovery, e.g. Hückel's rule, and then breaking it down into a "question lad…

Domain: rl-env
License: unknown
Published: Feb 2026

Attribution

Leaderboard scores: OpenReward

Top models

Bar chart of Discovery30s scores (1 model). Highest value: GPT-5.2 (high) [modern day hindsight] at 98.

FAQ

What license is Discovery30s under?
The license for Discovery30s is unknown.