Sycophancy Eval
Saturated
Evaluate sycophancy of language models across a variety of free-form text-generation tasks.
- Publisher: Anthropic
- Domain: Assistants
- License: MIT
- Published: Nov 2024
- Notable for: Benchmark for evaluating Assistants.
Top score: 100.0% by GPT-4o-mini; 2 models reporting (2 from frontier labs)
Score history (2)
Top models (2)
Related tools (3)
Implementations, trainers, datasets and scaffolds linked to this eval.
FAQ
- What is Sycophancy Eval?
- Sycophancy Eval is a benchmark that evaluates the sycophancy of language models across a variety of free-form text-generation tasks.
- What is the current top score on Sycophancy Eval?
- The top reported score is 100.0% by GPT-4o-mini, across 2 models reporting (2 from frontier labs).
- How can a model improve its Sycophancy Eval score?
- Tools linked to Sycophancy Eval on Sophon include Sycophancy ENV RL Env (Community), Sycophancy EVAL RL Env (Prime Community), and Sycophancy EVAL RL Env (Prime Intellect). These are RL environments, datasets, and scaffolds that target this eval.
- What license is Sycophancy Eval under?
- Sycophancy Eval is available under the MIT license.