What is the current top score on IDE-Bench?

The top reported score is 87.5% by Claude Sonnet 4.5, across 13 models reporting (7 from frontier labs).

IDE-Bench

Frontier

Agentic software-engineering tasks evaluated inside a real IDE development workflow, not isolated patch generation. By AfterQuery.

Publisher: AfterQuery
Domain: Coding
Published: Jun 2026
Notable for: Scores coding agents on end-to-end IDE workflows; open dataset on GitHub (AfterQuery/ide-bench).
Canonical: ide-bench.com
Official leaderboard: ide-bench.com


Top score 87.5% by Claude Sonnet 4.5 - 13 models reporting (7 frontier)

[Bar chart: IDE-Bench scores for 13 models. Highest value: Claude Sonnet 4.5 at 87.5%. Source: ide-bench.com]
