SWE Atlas QnA
Codebase QnA is the first benchmark in the SWE-Atlas suite. It evaluates AI agents on deep code comprehension: tracing execution paths, explaining architectural decisions, and answering deeply technical questions about production-grade software systems.
- Domain: rl-env
- License: unknown
- Published: Apr 2026
Top score 40.8 by GPT-5.4 - 2 models reporting (1 frontier)
Top models
FAQ
- What is SWE Atlas QnA?
- Codebase QnA is the first benchmark in the SWE-Atlas suite. It evaluates AI agents on deep code comprehension - tracing execution paths, explaining architectural decisions, and answering deeply technical questions about production-grade software systems.
- What is the current top score on SWE Atlas QnA?
- The top reported score is 40.8, achieved by GPT-5.4. Only 2 models have reported results, 1 of them from a frontier lab.
- What license is SWE Atlas QnA under?
- The license for SWE Atlas QnA is listed as unknown.