IFBench

Frontier

Instruction-following benchmark measuring adherence to multi-step constraints.

Open

Publisher: Allen Institute for AI
Published: Jun 2025
Canonical: github.com/allenai/IFBench

Attribution

Leaderboard scores: AA

Top score: 82.9% by MiniMax M3, with 348 models reporting (76 from frontier labs).

Score history (348 models)

Top models (348 models; bar chart shows 21, highest value: MiniMax M3 at 82.9)

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

Ifbench RL Env (Community)

IFBench evaluation environment

Tags: Implementation, RL Env, Ifbench

Ifbench RL Env (Prime Intellect)

Prime Intellect

IFBench evaluation environment

Tags: Implementation, RL Env, Ifbench

Ifbench RL Env (Community)

IFBench evaluation environment for precise instruction following with verifiable constraints

Tags: Implementation, RL Env, Instruction Following, Verification, Constraints
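
To make "verifiable constraints" concrete, here is a minimal sketch of programmatic constraint checking. It is not IFBench's actual constraint set, scoring rule, or API: the `Constraint` class, the `min_words` and `contains_keyword` helpers, and the `score_response` function are all hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Constraint:
    """A hypothetical verifiable constraint: a name plus a pure check function."""
    name: str
    check: Callable[[str], bool]


def min_words(n: int) -> Constraint:
    # Passes when the response has at least n whitespace-separated words.
    return Constraint(f"min_words({n})", lambda text: len(text.split()) >= n)


def contains_keyword(word: str) -> Constraint:
    # Passes when the keyword appears as a whole word, case-insensitively.
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return Constraint(
        f"contains_keyword({word!r})",
        lambda text: bool(pattern.search(text)),
    )


def score_response(text: str, constraints: List[Constraint]) -> float:
    # Fraction of constraints satisfied (an assumed scoring rule, not IFBench's).
    if not constraints:
        return 0.0
    passed = sum(1 for c in constraints if c.check(text))
    return passed / len(constraints)


constraints = [min_words(5), contains_keyword("benchmark")]
response = "This benchmark checks instruction following."
score = score_response(response, constraints)
print(score)
```

Because each check is a deterministic function of the response text, the same response always receives the same score, which is the property that makes such constraints usable as an RL reward signal.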

FAQ

What is IFBench? An instruction-following benchmark that measures adherence to multi-step constraints.
What is the current top score on IFBench? The top reported score is 82.9% by MiniMax M3, across 348 models reporting (76 from frontier labs).
How can a model improve its IFBench score? Sophon links three tools to IFBench: two community Ifbench RL Env implementations and the Prime Intellect Ifbench RL Env. Linked tools can be RL environments, datasets, or scaffolds that target this eval.