LiveCodeBench

Frontier

Rolling competitive-programming benchmark that scrapes LeetCode, AtCoder, and Codeforces problems published after a known cutoff date to limit training-data contamination.

Open
Domain: code
Format: Custom
Size: 1,055 tasks
License: MIT
Published: Mar 2024
Updates: Monthly
Notable for: The reference contamination-free coding leaderboard. Problems are date-stamped so each model is only scored on problems released after its training cutoff.
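The date-stamping idea can be illustrated with a minimal sketch. This is not LiveCodeBench's actual harness: the `Problem` record, its field names, the sample data, and the `contamination_free_score` helper are all hypothetical, and the score here is simply the share of eligible problems solved.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class Problem:
    """Hypothetical record for one date-stamped benchmark problem."""
    problem_id: str
    source: str      # e.g. "LeetCode", "AtCoder", "Codeforces"
    released: date   # date the problem was published
    solved: bool     # whether the model's submission passed (assumed precomputed)


def contamination_free_score(
    problems: Iterable[Problem], training_cutoff: date
) -> Optional[float]:
    """Percentage solved, counting only problems released after the cutoff.

    Returns None when no problem is newer than the cutoff.
    """
    eligible = [p for p in problems if p.released > training_cutoff]
    if not eligible:
        return None
    return 100.0 * sum(p.solved for p in eligible) / len(eligible)


# Made-up sample data, for illustration only.
problems = [
    Problem("lc-001", "LeetCode", date(2023, 6, 1), True),
    Problem("ac-002", "AtCoder", date(2024, 2, 10), True),
    Problem("cf-003", "Codeforces", date(2024, 5, 20), False),
    Problem("lc-004", "LeetCode", date(2024, 8, 3), True),
]

# With a cutoff of 2024-01-01, the 2023 problem is excluded from scoring.
score = contamination_free_score(problems, date(2024, 1, 1))
print(score)
```

With this filter, two models with different training cutoffs are scored on different problem subsets, so any comparison between them should state the date window used.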

Attribution

Leaderboard scores
AAprime-hub

Top score 91.7% by Gemini 3 Pro - 279 models reporting (62 frontier)

Score history

[Chart: score (0% to 100%) over time, Mar 23 to Jul 25. Labeled models: Claude Instant, Claude 2.0, GPT-4 Turbo, GPT-4o (2024-05-13), o1 Mini, o1, Gemini 2.5 Pro Preview (Mar '25), Gemini 3 Pro.]

Top models

[Bar chart: LiveCodeBench scores for 21 models. Highest value: Gemini 3 Pro at 91.7.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is LiveCodeBench?
Rolling competitive-programming benchmark that scrapes LeetCode, AtCoder, and Codeforces problems published after a known cutoff date to limit training-data contamination.
What capabilities does LiveCodeBench test?
LiveCodeBench evaluates code generation and debugging.
What is the current top score on LiveCodeBench?
The top reported score is 91.7% by Gemini 3 Pro, across 279 models reporting (62 from frontier labs).
How can a model improve its LiveCodeBench score?
Tools linked to LiveCodeBench on Sophon include Livecodebench RL Env (Prime Intellect) and OpenThoughts. These are RL environments, datasets, and scaffolds that target this eval.
What license is LiveCodeBench under?
LiveCodeBench is released under the MIT license.