hallucination

Slug: hallucination
Evals: 5
Tools: 7
Models: 1
Papers: 3

Evals testing this capability

5

Tools lifting evals here

7

Top models on this capability

1

by avg parsed score across evals here

[Bar chart: hallucination, 1 bar (1 model). Highest value: GPT-4.1 Mini at 33.3.]

Papers in this area

3