Sycophancy Eval

Saturated

Evaluate sycophancy of language models across a variety of free-form text-generation tasks.

Publisher: Anthropic
Domain: Assistants
License: MIT
Published: Nov 2024
Notable for: Benchmark for evaluating Assistants.

Top score: 100.0% by GPT-4o-mini (2 models reporting, both from frontier labs)

Score history

[Chart: score history for GPT-4o-mini; x-axis Jul 24 to Mar 25, y-axis 95% to 100%]

Top models

[Bar chart: Sycophancy Eval scores for 2 models. Highest value: GPT-4.1 Mini at 100.]

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Sycophancy Eval?
Sycophancy Eval evaluates the sycophancy of language models across a variety of free-form text-generation tasks.
What is the current top score on Sycophancy Eval?
The top reported score is 100.0% by GPT-4o-mini, across 2 models reporting (2 from frontier labs).
How can a model improve its Sycophancy Eval score?
Sophon links the following tools to Sycophancy Eval: Sycophancy ENV RL Env (Community), Sycophancy EVAL RL Env (Prime Community), and Sycophancy EVAL RL Env (Prime Intellect). These fall under the RL environments, datasets, and scaffolds that target this eval.
What license is Sycophancy Eval under?
Sycophancy Eval is available under the MIT license.