SWE-bench Multilingual
Frontier
A cross-language extension of SWE-bench Verified, covering real GitHub issues across multiple programming languages.
- Publisher: Princeton University
- Published: May 2026
- Canonical: swebench.com/multilingual.html
Top score: 72.7% by Gemini 3 Flash; 11 models reporting (7 from frontier labs).
Score history (11)
Top models (11)
Related tools (2)
Implementations, trainers, datasets and scaffolds linked to this eval.
FAQ
- What is SWE-bench Multilingual?
- A cross-language extension of SWE-bench Verified, covering real GitHub issues across multiple programming languages.
- What is the current top score on SWE-bench Multilingual?
- The top reported score is 72.7% by Gemini 3 Flash, across 11 models reporting (7 from frontier labs).
- How can a model improve its SWE-bench Multilingual score?
- Tools linked to SWE-bench Multilingual on Sophon include Agent Bench RL Env (Prime Community) and SWE RL Env (Prime Intellect). These are among the RL environments, datasets, and scaffolds that target this eval.
