LAB-Bench: Measuring Capabilities of Language Models for Biology Research

Active

Tests the ability of LLMs and LLM-augmented agents to answer questions on scientific research workflows in domains such as chemistry, biology, and materials science, as well as more general science tasks.

Open
Publisher: FutureHouse
Domain: Safeguards
License: MIT
Published: Feb 2025
Notable for: Benchmark for evaluating Safeguards.

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is LAB-Bench: Measuring Capabilities of Language Models for Biology Research?
It tests the ability of LLMs and LLM-augmented agents to answer questions on scientific research workflows in domains such as chemistry, biology, and materials science, as well as more general science tasks.
How can a model improve its LAB-Bench: Measuring Capabilities of Language Models for Biology Research score?
Tools linked to LAB-Bench: Measuring Capabilities of Language Models for Biology Research on Sophon include Biomni ENV RL Env (Community). Linked tools are RL environments, datasets, and scaffolds that target this eval.
What license is LAB-Bench: Measuring Capabilities of Language Models for Biology Research under?
LAB-Bench: Measuring Capabilities of Language Models for Biology Research is available under the MIT license.