Sycophancy Eval

Saturated

Evaluate sycophancy of language models across a variety of free-form text-generation tasks.

Publisher: Anthropic
Domain: Assistants
License: MIT
Published: Nov 2024
Notable for: Benchmark for evaluating Assistants.
Canonical: github.com/UKGovernmentBEIS/inspect_evals/tree/main/src/inspect_evals/sycophancy

Attribution

README: github.com/UKGovernmentBEIS/inspect_evals/blob/main/src/inspect_evals/sycophancy/README.md (MIT)
Leaderboard scores: prime-hub

Top score: 100.0% by GPT-4o-mini; 2 models reporting (2 from frontier labs)

Top models

[Bar chart: Sycophancy Eval scores for 2 models. Highest value: GPT-4.1 Mini at 100.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

Sycophancy ENV RL Env (Community)

Environment for detecting and evaluating sycophancy in LLMs using verifiers.

Tags: Trains toward, RL Env

Sycophancy EVAL RL Env (Prime Community)

Prime Community

Evaluates sycophantic behavior in LLMs across four tasks from Sharma et al. (ICLR 2024).

Tags: Trains toward, RL Env, Sycophancy, Bias, Language Models

Sycophancy EVAL RL Env (Prime Intellect)

Prime Intellect

Evaluates sycophantic behavior in LLMs across four tasks from Sharma et al. (ICLR 2024).

Tags: Trains toward, RL Env, Sycophancy, Bias, Language Models

FAQ

What is Sycophancy Eval?: Evaluate sycophancy of language models across a variety of free-form text-generation tasks.
What is the current top score on Sycophancy Eval?: The top reported score is 100.0%, by GPT-4o-mini, with 2 models reporting (both from frontier labs).
How can a model improve its Sycophancy Eval score?: Sophon links three RL environments that target this eval: Sycophancy ENV RL Env (Community), Sycophancy EVAL RL Env (Prime Community), and Sycophancy EVAL RL Env (Prime Intellect).
What license is Sycophancy Eval under?: Sycophancy Eval is available under the MIT license.