Artificial Analysis - Intelligence Index

Aggregate Intelligence Index (0-100) over MMLU-Pro, GPQA-Diamond, HumanEval, MATH-500, and other reasoning benchmarks. Published by Artificial Analysis with per-model pricing, throughput, and latency.

Kind: Aggregated
Updates: weekly · updated 7h ago
Notable for: intelligence-index
Tracks: 12 evals · aggregated

Intelligence ranking

[Bar chart: Artificial Analysis - Intelligence Index, 21 models. Highest value: Claude Opus 4.8 at 61.4.]

Per-eval breakdown

460 models

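Model | Provider | Eval 1 | Eval 2 | Eval 3 | Eval 4 | Eval 5 | Eval 6 | Eval 7 | Eval 8 | Eval 9 | Eval 10 | Eval 11 | Eval 12 | Avg
R1 1776 | Perplexity AI | - | - | - | - | - | - | 95.4% | - | - | - | - | - | 95.4%
o1 Preview | OpenAI | - | - | - | - | - | - | 92.4% | - | - | - | - | - | 92.4%
Qwen3-235B-A22B | Alibaba Qwen (Tongyi Qianwen) | 85.7% | - | - | - | - | - | - | - | - | - | - | - | 85.7%
o3 Pro | OpenAI | - | - | 84.5% | - | - | - | - | - | - | - | - | - | 84.5%
Hermes 4 (405B) | Nous Research | 81.9% | - | - | - | - | - | - | - | - | - | - | - | 81.9%
DeepSeek-V2.5 (Dec '24) | DeepSeek | - | - | - | - | - | - | 76.3% | - | - | - | - | - | 76.3%
DeepSeek-Coder-V2 | DeepSeek | - | - | - | - | - | - | 74.3% | - | - | - | - | - | 74.3%
Gemini 3 Pro | Google (Alphabet Inc.) | - | 95.7% | 91.9% | 37.5% | 70.4% | 91.7% | - | 89.8% | 56.1% | 87.1% | 41.7% | - | 73.5%
Gemini 3 Flash Preview | Google (Alphabet Inc.) | - | 97.0% | 89.8% | 34.7% | 78.0% | 90.8% | - | 89.0% | 50.6% | 80.4% | 38.6% | - | 72.1%
o3 | OpenAI | 96.7% | 88.3% | 87.7% | 20.0% | 71.4% | 80.8% | 99.2% | 85.3% | 41.0% | 80.7% | 37.1% | - | 71.7%
Grok 4 | xAI | 94.3% | 92.7% | 87.7% | 23.9% | 53.7% | 81.9% | 99.0% | 86.6% | 45.7% | 74.9% | 37.9% | - | 70.7%
Gemini 3.1 Pro Preview | Google (Alphabet Inc.) | - | - | 94.1% | 44.7% | 77.1% | - | - | - | 58.9% | 95.6% | 53.8% | - | 70.7%
Gemini 2.5 Pro Preview (Mar' 25) | Google (Alphabet Inc.) | 87.0% | - | 83.6% | 17.1% | - | 77.8% | 98.0% | 85.8% | 39.5% | - | - | - | 69.8%
Gemini 2.5 Pro Preview (May' 25) | Google (Alphabet Inc.) | 84.3% | - | 82.2% | 15.4% | - | 77.0% | 98.6% | 83.7% | 41.6% | - | - | - | 69.0%
GPT-5 Codex | OpenAI | - | 98.7% | 83.7% | 25.6% | 74.1% | 84.0% | - | 86.5% | 40.9% | 86.8% | 37.9% | - | 68.7%
Claude Opus 4.8 | Anthropic | - | - | 92.0% | 45.7% | 62.2% | - | - | - | 53.5% | 94.4% | 58.3% | - | 67.7%
Qwen3.7 Max | Alibaba | - | - | 92.3% | 38.1% | 80.5% | - | - | - | 48.8% | 94.7% | 50.8% | - | 67.5%
Gemini 3 Deep Think | Google DeepMind | - | - | 93.8% | 41.0% | - | - | - | - | - | - | - | - | 67.4%
Kimi K2 Thinking | Kimi | - | 94.7% | 83.8% | 22.3% | 68.1% | 85.3% | - | 84.8% | 42.4% | 93.0% | 31.1% | - | 67.3%
GPT-5.1-Codex | OpenAI | - | 95.7% | 86.0% | 23.4% | 70.0% | 84.9% | - | 86.0% | 40.2% | 83.0% | 34.8% | - | 67.1%
o4 Mini | OpenAI | 94.0% | 90.7% | 78.4% | 17.5% | 68.7% | 85.9% | 98.9% | 83.2% | 46.5% | 55.6% | 15.2% | - | 66.8%
GPT-5.3-Codex | OpenAI | - | - | 91.5% | 39.9% | 75.4% | - | - | - | 53.2% | 86.0% | 53.0% | - | 66.5%
Kimi K2.6 | Moonshot AI | - | - | 91.1% | 35.9% | 76.0% | - | - | - | 53.5% | 95.9% | 43.9% | - | 66.1%
Qwen3.5 397B A17B | Alibaba | - | - | 89.3% | 27.3% | 78.8% | - | - | 87.3% | 42.0% | 95.6% | 40.9% | - | 65.9%
Muse Spark | Meta Platforms | - | - | 88.4% | 39.9% | 75.9% | - | - | - | 51.5% | 91.5% | 45.5% | - | 65.4%
Gemini 2.5 Pro | Google (Alphabet Inc.) | 88.7% | 87.7% | 84.4% | 21.1% | 48.7% | 80.1% | 96.7% | 86.2% | 42.8% | 54.1% | 26.5% | - | 65.2%
Grok 3 mini | xAI | 93.3% | 84.7% | 79.1% | 11.1% | 45.9% | 69.6% | 99.2% | 82.8% | 40.6% | 90.4% | 17.4% | - | 64.9%
MiniMax M2.1 | Minimax | - | 82.7% | 83.0% | 22.2% | 69.9% | 81.0% | - | 87.5% | 40.7% | 85.4% | 28.8% | - | 64.6%
Gemini 3 Pro Preview | Google (Alphabet Inc.) | - | 86.7% | 88.7% | 27.6% | 49.7% | 85.7% | - | 89.5% | 49.9% | 68.1% | 34.1% | - | 64.4%
GPT-5.2-Codex | OpenAI | - | - | 89.9% | 33.5% | 77.6% | - | - | - | 54.6% | 92.1% | 37.1% | - | 64.1%
Qwen3 235B A22B Thinking 2507 | Alibaba | 94.0% | 91.0% | 79.0% | 15.0% | 51.2% | 78.8% | 98.4% | 84.3% | 42.4% | 53.2% | 13.6% | - | 63.7%
Qwen3.6 Max Preview | Alibaba | - | - | 88.8% | 28.9% | 76.6% | - | - | - | 46.9% | 95.9% | 43.9% | - | 63.5%
KAT-Coder-Pro V1 | KwaiKAT | - | 94.7% | 76.4% | 33.4% | 68.4% | 74.7% | - | 81.3% | 36.6% | 88.6% | 9.1% | - | 62.6%
GPT-5.1-Codex-Mini | OpenAI | - | 91.7% | 81.3% | 16.9% | 67.9% | 83.6% | - | 82.0% | 42.6% | 62.9% | 33.3% | - | 62.5%
Qwen3.6 Plus | Alibaba | - | - | 88.2% | 25.7% | 75.2% | - | - | - | 40.7% | 97.7% | 43.9% | - | 61.9%
MiniMax M2 | Minimax | - | 78.3% | 77.7% | 12.5% | 72.3% | 82.6% | - | 82.0% | 36.1% | 86.8% | 25.8% | - | 61.6%
Gemini 2.5 Flash Preview (Reasoning) | Google (Alphabet Inc.) | 84.3% | - | 69.8% | 11.6% | - | 50.5% | 98.1% | 80.0% | 35.9% | - | - | - | 61.5%
MiMo-V2-Pro | Xiaomi | - | - | 87.0% | 28.3% | 68.8% | - | - | - | 42.5% | 95.0% | 40.9% | - | 60.4%
MiniMax M2.7 | Minimax | - | - | 87.4% | 28.1% | 75.7% | - | - | - | 47.0% | 84.8% | 39.4% | - | 60.4%
GLM 5 Turbo | Zai | - | - | 84.7% | 25.4% | 73.2% | - | - | - | 43.6% | 98.5% | 33.3% | - | 59.8%
Claude Opus 4.5 | Anthropic | - | 62.7% | 81.0% | 12.9% | 43.0% | 73.8% | - | 88.9% | 47.0% | 86.3% | 40.9% | - | 59.6%
GLM 5.1 | Zai | - | - | 83.9% | 25.6% | 52.0% | - | - | 85.4% | 36.1% | 97.1% | 35.6% | - | 59.4%
GLM 4.5 | Zai | 87.3% | 73.7% | 78.2% | 12.2% | 44.1% | 73.8% | 97.9% | 83.5% | 34.8% | 43.0% | 22.0% | - | 59.1%
MiMo-V2.5 | Xiaomi | - | - | 84.9% | 25.2% | 67.1% | - | - | - | 43.1% | 90.6% | 41.7% | - | 58.8%
Claude 4.1 Opus | Anthropic | - | 80.3% | 80.9% | 11.9% | 55.4% | 65.4% | - | 88.0% | 40.9% | 71.4% | 34.3% | - | 58.7%
DeepSeek V3.2 Speciale | DeepSeek | - | 96.7% | 87.1% | 26.1% | 63.9% | 89.6% | - | 86.3% | 44.0% | 0.0% | 34.8% | - | 58.7%
ERNIE 5.0 Thinking Preview | Baidu | - | 85.0% | 77.7% | 12.7% | 41.4% | 81.2% | - | 83.0% | 37.5% | 83.9% | 25.0% | - | 58.6%
Qwen3.5-122B-A10B | Alibaba | - | - | 85.7% | 23.4% | 75.7% | - | - | - | 42.0% | 93.6% | 31.1% | - | 58.6%
o1 | OpenAI | 72.3% | - | 74.7% | 7.7% | 70.3% | 67.9% | 97.0% | 84.1% | 35.8% | 62.6% | 12.9% | - | 58.5%
Qwen3.5-27B | Alibaba | - | - | 85.8% | 22.2% | 75.6% | - | - | - | 39.5% | 93.9% | 32.6% | - | 58.3%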
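
The last figure in each row above appears to be the plain mean of whichever per-eval scores the model has, with missing cells ("-") skipped. For Gemini 3 Deep Think, (93.8 + 41.0) / 2 = 67.4. The sketch below checks that reading for a single row. It assumes every cell is either "-" or a non-negative percentage, and the helper names `parse_cells` and `mean_of_reported` are hypothetical, not part of any Artificial Analysis tooling.

```python
import re
from typing import List, Optional

# A cell is either a lone "-" (no score) or a percentage such as "93.8%".
# Assumption: scores are never negative, so "-" is unambiguous.
CELL = re.compile(r"-|\d+(?:\.\d+)?%")


def parse_cells(row: str) -> List[Optional[float]]:
    """Split one score row into cells; a missing cell becomes None."""
    return [None if tok == "-" else float(tok[:-1]) for tok in CELL.findall(row)]


def mean_of_reported(cells: List[Optional[float]]) -> Optional[float]:
    """Mean of the cells that carry a score, rounded to one decimal place."""
    reported = [c for c in cells if c is not None]
    if not reported:
        return None
    return round(sum(reported) / len(reported), 1)


# Gemini 3 Deep Think: two reported evals, then the listed average.
row = "- - 93.8% 41.0% - - - - - - - - 67.4%"
cells = parse_cells(row)
evals, listed = cells[:-1], cells[-1]
print(mean_of_reported(evals), listed)
```

Because the mean runs only over reported evals, a model with a single very high score (R1 1776 at 95.4%) sorts above models scored on eleven evals, so the final column is not comparable across rows with different coverage.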

Evals tracked: 12