LiveBench
Frontier
Rolling, contamination-free benchmark whose questions are updated monthly across math, coding, reasoning, language, instruction following, and data analysis.
- Publisher: Meta FAIR (Fundamental AI Research)
- Format: Custom
- Size: 1000 tasks
- License: Apache-2.0
- Published: Jun 2024
- Notable for: Benchmark for evaluating math, code generation, and instruction following.
- Canonical: livebench.ai
Top score: 82.4% by Gemini 3.1 Pro Preview; 137 models reporting (59 from frontier labs).
FAQ
- What is LiveBench?
- LiveBench is a rolling, contamination-free benchmark whose questions are updated monthly across math, coding, reasoning, language, instruction following, and data analysis.
- What capabilities does LiveBench test?
- LiveBench evaluates math, coding, reasoning, language, instruction following, and data analysis.
- What is the current top score on LiveBench?
- The top reported score is 82.4% by Gemini 3.1 Pro Preview, across 137 models reporting (59 from frontier labs).
- What license is LiveBench under?
- LiveBench is available under the Apache-2.0 license.
