SWE Bench Multilingual

A benchmark of 300 software engineering tasks across 42 repositories and 9 programming languages: C, C++, Go, Java, JavaScript, TypeScript, PHP, Ruby, and Rust. Each instance is derived from a real GitHub pull request, following the same format and evaluation protocol as SWE-b…

Domain: rl-env
License: unknown
Published: Jan 2026

Attribution

Leaderboard scores: OpenReward

Top score: 74.1 by MiniMax M2.5, with 8 models reporting (2 from frontier labs).

Score history

[Chart: scores on a 0 to 100 axis; time axis ticks Feb 26, Mar 26, Apr 26; labeled models: GLM 5, MiniMax M2.5]

Top models

[Bar chart: SWE Bench Multilingual, 8 models. Highest value: Claude Mythos Preview at 87.3.]

FAQ

What is SWE Bench Multilingual?
A benchmark of 300 software engineering tasks across 42 repositories and 9 programming languages: C, C++, Go, Java, JavaScript, TypeScript, PHP, Ruby, and Rust. Each instance is derived from a real GitHub pull request, following the same format and evaluation protocol as SWE-b…
What is the current top score on SWE Bench Multilingual?
The top reported score is 74.1 by MiniMax M2.5, across 8 models reporting (2 from frontier labs).
What license is SWE Bench Multilingual under?
No license is listed for SWE Bench Multilingual; the license field is marked unknown.