Artificial Analysis - Intelligence Index
Aggregate Intelligence Index (0-100) over MMLU-Pro, GPQA-Diamond, HumanEval, MATH-500, and other reasoning benchmarks. Published by Artificial Analysis with per-model pricing, throughput, and latency.
- Operator: Artificial Analysis
- Kind: Aggregated
- Updates: weekly (updated 7h ago)
- Notable for: intelligence-index
- Tracks: 12 evals, aggregated
Intelligence ranking
Per-eval breakdown
460 models
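| Model and provider | Eval 1 | Eval 2 | Eval 3 | Eval 4 | Eval 5 | Eval 6 | Eval 7 | Eval 8 | Eval 9 | Eval 10 | Eval 11 | Eval 12 | Aggregate |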
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R1 1776 Perplexity AI | - | - | - | - | - | - | 95.4% | - | - | - | - | - | 95.4% |
| o1 Preview OpenAI | - | - | - | - | - | - | 92.4% | - | - | - | - | - | 92.4% |
| Qwen3-235B-A22B Alibaba Qwen (Tongyi Qianwen) | 85.7% | - | - | - | - | - | - | - | - | - | - | - | 85.7% |
| o3 Pro OpenAI | - | - | 84.5% | - | - | - | - | - | - | - | - | - | 84.5% |
| Hermes 4 (405B) Nous Research | 81.9% | - | - | - | - | - | - | - | - | - | - | - | 81.9% |
| DeepSeek-V2.5 (Dec '24) DeepSeek | - | - | - | - | - | - | 76.3% | - | - | - | - | - | 76.3% |
| DeepSeek-Coder-V2 DeepSeek | - | - | - | - | - | - | 74.3% | - | - | - | - | - | 74.3% |
| Gemini 3 Pro Google (Alphabet Inc.) | - | 95.7% | 91.9% | 37.5% | 70.4% | 91.7% | - | 89.8% | 56.1% | 87.1% | 41.7% | - | 73.5% |
| Gemini 3 Flash Preview Google (Alphabet Inc.) | - | 97.0% | 89.8% | 34.7% | 78.0% | 90.8% | - | 89.0% | 50.6% | 80.4% | 38.6% | - | 72.1% |
| o3 OpenAI | 96.7% | 88.3% | 87.7% | 20.0% | 71.4% | 80.8% | 99.2% | 85.3% | 41.0% | 80.7% | 37.1% | - | 71.7% |
| Grok 4 xAI | 94.3% | 92.7% | 87.7% | 23.9% | 53.7% | 81.9% | 99.0% | 86.6% | 45.7% | 74.9% | 37.9% | - | 70.7% |
| Gemini 3.1 Pro Preview Google (Alphabet Inc.) | - | - | 94.1% | 44.7% | 77.1% | - | - | - | 58.9% | 95.6% | 53.8% | - | 70.7% |
| Gemini 2.5 Pro Preview (Mar' 25) Google (Alphabet Inc.) | 87.0% | - | 83.6% | 17.1% | - | 77.8% | 98.0% | 85.8% | 39.5% | - | - | - | 69.8% |
| Gemini 2.5 Pro Preview (May' 25) Google (Alphabet Inc.) | 84.3% | - | 82.2% | 15.4% | - | 77.0% | 98.6% | 83.7% | 41.6% | - | - | - | 69.0% |
| GPT-5 Codex OpenAI | - | 98.7% | 83.7% | 25.6% | 74.1% | 84.0% | - | 86.5% | 40.9% | 86.8% | 37.9% | - | 68.7% |
| Claude Opus 4.8 Anthropic | - | - | 92.0% | 45.7% | 62.2% | - | - | - | 53.5% | 94.4% | 58.3% | - | 67.7% |
| Qwen3.7 Max Alibaba | - | - | 92.3% | 38.1% | 80.5% | - | - | - | 48.8% | 94.7% | 50.8% | - | 67.5% |
| Gemini 3 Deep Think Google DeepMind | - | - | 93.8% | 41.0% | - | - | - | - | - | - | - | - | 67.4% |
| Kimi K2 Thinking Kimi | - | 94.7% | 83.8% | 22.3% | 68.1% | 85.3% | - | 84.8% | 42.4% | 93.0% | 31.1% | - | 67.3% |
| GPT-5.1-Codex OpenAI | - | 95.7% | 86.0% | 23.4% | 70.0% | 84.9% | - | 86.0% | 40.2% | 83.0% | 34.8% | - | 67.1% |
| o4 Mini OpenAI | 94.0% | 90.7% | 78.4% | 17.5% | 68.7% | 85.9% | 98.9% | 83.2% | 46.5% | 55.6% | 15.2% | - | 66.8% |
| GPT-5.3-Codex OpenAI | - | - | 91.5% | 39.9% | 75.4% | - | - | - | 53.2% | 86.0% | 53.0% | - | 66.5% |
| Kimi K2.6 Moonshot AI | - | - | 91.1% | 35.9% | 76.0% | - | - | - | 53.5% | 95.9% | 43.9% | - | 66.1% |
| Qwen3.5 397B A17B Alibaba | - | - | 89.3% | 27.3% | 78.8% | - | - | 87.3% | 42.0% | 95.6% | 40.9% | - | 65.9% |
| Muse Spark Meta Platforms | - | - | 88.4% | 39.9% | 75.9% | - | - | - | 51.5% | 91.5% | 45.5% | - | 65.4% |
| Gemini 2.5 Pro Google (Alphabet Inc.) | 88.7% | 87.7% | 84.4% | 21.1% | 48.7% | 80.1% | 96.7% | 86.2% | 42.8% | 54.1% | 26.5% | - | 65.2% |
| Grok 3 mini xAI | 93.3% | 84.7% | 79.1% | 11.1% | 45.9% | 69.6% | 99.2% | 82.8% | 40.6% | 90.4% | 17.4% | - | 64.9% |
| MiniMax M2.1 Minimax | - | 82.7% | 83.0% | 22.2% | 69.9% | 81.0% | - | 87.5% | 40.7% | 85.4% | 28.8% | - | 64.6% |
| Gemini 3 Pro Preview Google (Alphabet Inc.) | - | 86.7% | 88.7% | 27.6% | 49.7% | 85.7% | - | 89.5% | 49.9% | 68.1% | 34.1% | - | 64.4% |
| GPT-5.2-Codex OpenAI | - | - | 89.9% | 33.5% | 77.6% | - | - | - | 54.6% | 92.1% | 37.1% | - | 64.1% |
| Qwen3 235B A22B Thinking 2507 Alibaba | 94.0% | 91.0% | 79.0% | 15.0% | 51.2% | 78.8% | 98.4% | 84.3% | 42.4% | 53.2% | 13.6% | - | 63.7% |
| Qwen3.6 Max Preview Alibaba | - | - | 88.8% | 28.9% | 76.6% | - | - | - | 46.9% | 95.9% | 43.9% | - | 63.5% |
| KAT-Coder-Pro V1 KwaiKAT | - | 94.7% | 76.4% | 33.4% | 68.4% | 74.7% | - | 81.3% | 36.6% | 88.6% | 9.1% | - | 62.6% |
| GPT-5.1-Codex-Mini OpenAI | - | 91.7% | 81.3% | 16.9% | 67.9% | 83.6% | - | 82.0% | 42.6% | 62.9% | 33.3% | - | 62.5% |
| Qwen3.6 Plus Alibaba | - | - | 88.2% | 25.7% | 75.2% | - | - | - | 40.7% | 97.7% | 43.9% | - | 61.9% |
| MiniMax M2 Minimax | - | 78.3% | 77.7% | 12.5% | 72.3% | 82.6% | - | 82.0% | 36.1% | 86.8% | 25.8% | - | 61.6% |
| Gemini 2.5 Flash Preview (Reasoning) Google (Alphabet Inc.) | 84.3% | - | 69.8% | 11.6% | - | 50.5% | 98.1% | 80.0% | 35.9% | - | - | - | 61.5% |
| MiMo-V2-Pro Xiaomi | - | - | 87.0% | 28.3% | 68.8% | - | - | - | 42.5% | 95.0% | 40.9% | - | 60.4% |
| MiniMax M2.7 Minimax | - | - | 87.4% | 28.1% | 75.7% | - | - | - | 47.0% | 84.8% | 39.4% | - | 60.4% |
| GLM 5 Turbo Zai | - | - | 84.7% | 25.4% | 73.2% | - | - | - | 43.6% | 98.5% | 33.3% | - | 59.8% |
| Claude Opus 4.5 Anthropic | - | 62.7% | 81.0% | 12.9% | 43.0% | 73.8% | - | 88.9% | 47.0% | 86.3% | 40.9% | - | 59.6% |
| GLM 5.1 Zai | - | - | 83.9% | 25.6% | 52.0% | - | - | 85.4% | 36.1% | 97.1% | 35.6% | - | 59.4% |
| GLM 4.5 Zai | 87.3% | 73.7% | 78.2% | 12.2% | 44.1% | 73.8% | 97.9% | 83.5% | 34.8% | 43.0% | 22.0% | - | 59.1% |
| MiMo-V2.5 Xiaomi | - | - | 84.9% | 25.2% | 67.1% | - | - | - | 43.1% | 90.6% | 41.7% | - | 58.8% |
| Claude 4.1 Opus Anthropic | - | 80.3% | 80.9% | 11.9% | 55.4% | 65.4% | - | 88.0% | 40.9% | 71.4% | 34.3% | - | 58.7% |
| DeepSeek V3.2 Speciale DeepSeek | - | 96.7% | 87.1% | 26.1% | 63.9% | 89.6% | - | 86.3% | 44.0% | 0.0% | 34.8% | - | 58.7% |
| ERNIE 5.0 Thinking Preview Baidu | - | 85.0% | 77.7% | 12.7% | 41.4% | 81.2% | - | 83.0% | 37.5% | 83.9% | 25.0% | - | 58.6% |
| Qwen3.5-122B-A10B Alibaba | - | - | 85.7% | 23.4% | 75.7% | - | - | - | 42.0% | 93.6% | 31.1% | - | 58.6% |
| o1 OpenAI | 72.3% | - | 74.7% | 7.7% | 70.3% | 67.9% | 97.0% | 84.1% | 35.8% | 62.6% | 12.9% | - | 58.5% |
| Qwen3.5-27B Alibaba | - | - | 85.8% | 22.2% | 75.6% | - | - | - | 39.5% | 93.9% | 32.6% | - | 58.3% |
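The final column of each row appears to be the unweighted mean of whichever per-eval scores are reported for that model, with `-` cells skipped. This is inferred from the rows shown here, not from a published formula. The sketch below recomputes it for one row; the helper names `parse_score` and `aggregate` are hypothetical and not part of any Artificial Analysis API.

```python
from typing import Optional, Sequence


def parse_score(cell: str) -> Optional[float]:
    """Turn a table cell such as '95.7%' into 95.7; '-' means not reported."""
    cell = cell.strip()
    if cell in ("", "-"):
        return None
    return float(cell.rstrip("%"))


def aggregate(cells: Sequence[str]) -> Optional[float]:
    """Unweighted mean of the reported scores, rounded to one decimal.

    Assumption: missing evals are skipped rather than counted as zero.
    """
    scores = [s for s in map(parse_score, cells) if s is not None]
    if not scores:
        return None
    return round(sum(scores) / len(scores), 1)


# The twelve per-eval cells copied from the Gemini 3 Pro row of the table.
gemini_3_pro = ["-", "95.7%", "91.9%", "37.5%", "70.4%", "91.7%",
                "-", "89.8%", "56.1%", "87.1%", "41.7%", "-"]

print(aggregate(gemini_3_pro))  # 73.5, matching the last column of that row
```

Two consequences follow if this reading is right. A model with a single reported eval (for example R1 1776 at 95.4%) gets that one score as its aggregate, which is why sparsely evaluated models sit at the top of the ranking. And because the displayed per-eval values are already rounded, a recomputed mean can differ from the published figure by 0.1 for some rows.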
