EdisonScientific is an organization.
Bioinformatics Benchmark (BixBench) is a dataset comprising over 50 real-world scenarios of practical biological data analysis with nearly 300 associated open-answer questions designed to measure the ability of LLM-based agents to explore biological datasets, perform long, mul…
The dataset used to test the ether0 scientific reasoning model.
The Language Agent Biology Benchmark, or LAB-Bench, is an evaluation dataset for AI systems intended to benchmark capabilities foundational to scientific research in biology. This is an implementation of a benchmark made by FutureHouse.