LAB Bench
The Language Agent Biology Benchmark, or LAB-Bench, is an evaluation dataset for AI systems intended to benchmark capabilities foundational to scientific research in biology. This is an implementation of a benchmark made by FutureHouse.
- Domain: rl-env
- License: unknown
- Published: Jan 2026
Top score: 58 by Claude Opus 4.6 (1 model reporting, 1 from a frontier lab)
FAQ
- What is LAB Bench?
- The Language Agent Biology Benchmark, or LAB-Bench, is an evaluation dataset for AI systems intended to benchmark capabilities foundational to scientific research in biology. This is an implementation of a benchmark made by FutureHouse.
- What is the current top score on LAB Bench?
- The top reported score is 58, achieved by Claude Opus 4.6. Only one model has reported a score so far, and it comes from a frontier lab.
- What license is LAB Bench under?
- The license for LAB Bench is listed as unknown.