TruthfulQA
817 questions targeting common human misconceptions, measuring whether a model gives factually true answers or repeats popular falsehoods.
- Capabilities
- HallucinationFactual Recall
- Format
- HF Dataset
- Size
- 817 tasks
- License
- Apache-2.0
- Published
- Sep 2021
- Notable for
- Benchmark for evaluating hallucination and factual recall.
- Canonical
- github.com/sylinrl/TruthfulQA
Cite
Notes
Only stored in your browser.
Where it's ranked
1Related tools
1Implementations, trainers, datasets and scaffolds linked to this eval.
Papers
2Contributors
3FAQ
- What is TruthfulQA?
- 817 questions targeting common human misconceptions, measuring whether a model gives factually true answers or repeats popular falsehoods.
- What capabilities does TruthfulQA test?
- TruthfulQA evaluates hallucination, factual recall.
- How can a model improve its TruthfulQA score?
- Tools linked to TruthfulQA on Sophon include Anthropic HH-RLHF - RL environments, datasets, and scaffolds that target this eval.
- What license is TruthfulQA under?
- TruthfulQA is available under Apache-2.0.