SciCode
Frontier
80 expert-curated scientific coding problems (PDE solvers, quantum simulation, condensed-matter calculations) with hidden test cases.
- Publisher: University of California, Berkeley
- Capabilities: Code Generation, Scientific Reasoning
- Domain: science
- Format: HF Dataset
- Size: 80 tasks
- License: Apache-2.0
- Published: Jul 2024
- Notable for: Benchmark for evaluating code generation and scientific reasoning in the science domain.
- Canonical: scicode-bench.github.io
Top score: 58.9% by Gemini 3.1 Pro Preview; 387 models reporting (79 from frontier labs)
Score history (387)
Top models (387)
Related tools (2)
Implementations, trainers, datasets and scaffolds linked to this eval.
FAQ
- What is SciCode?
- 80 expert-curated scientific coding problems (PDE solvers, quantum simulation, condensed-matter calculations) with hidden test cases.
- What capabilities does SciCode test?
- SciCode evaluates code generation and scientific reasoning.
- What is the current top score on SciCode?
- The top reported score is 58.9% by Gemini 3.1 Pro Preview, across 387 models reporting (79 from frontier labs).
- How can a model improve its SciCode score?
- Tools linked to SciCode on Sophon include Scicode RL Env (Prime Community) and Scicode RL Env (Prime Intellect). Linked tools are RL environments, datasets, and scaffolds that target this eval.
- What license is SciCode under?
- SciCode is available under Apache-2.0.
