APEX

Active

Mercor's expert-graded eval: domain experts (doctors, lawyers, engineers) grade model responses on long-form professional tasks they would actually be paid to do.

Publisher: Mercor
Capabilities: Instruction Following, Factual Recall, LLM Judging
Format: Manual
License: Closed
Published: Sep 2025
Notable for: Benchmark for evaluating instruction following, factual recall, and LLM judging.
Canonical: mercor.com/apex

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

APEX Agents RL Env (Community)

APEX-Agents benchmark: 480 professional services tasks across Law, Investment Banking, and Management Consulting

Trains toward · RL Env · Tool Use · Agentic · Law

Papers

APEX: An Expert-Authored Benchmark for Real-World Expert Workflows

preprint · 2025

Mercor's benchmark of high-difficulty, expert-authored tasks drawn from real professional workflows (consulting, finance, legal, medical research), graded by domain experts.

FAQ

What is APEX?: Mercor's expert-graded eval, in which domain experts (doctors, lawyers, engineers) grade model responses on long-form professional tasks they would actually be paid to do.
What capabilities does APEX test?: APEX evaluates instruction following, factual recall, and LLM judging.
How can a model improve its APEX score?: Sophon links tools that target this eval (RL environments, datasets, and scaffolds), including APEX Agents RL Env (Community).
What license is APEX under?: APEX is released under a closed license.