LAB-Bench

The Language Agent Biology Benchmark (LAB-Bench) is an evaluation dataset for benchmarking AI systems on capabilities foundational to scientific research in biology. This environment is an implementation of the benchmark created by FutureHouse.

Type: RL Env
Publisher: EdisonScientific
Tags: Scientific Research Assistance
Runtime: ORS
License: unknown
Size: 1967 tasks
Published: Jan 2026
Canonical: openreward.ai/EdisonScientific/LAB-Bench

Attribution

README: openreward.ai/EdisonScientific/LAB-Bench
Scores: OpenReward

Public scores on this env

2 vf-eval reports across 1 model

Rank 1: Claude Opus 4.6 (Anthropic), score 58
