0

TruthfulQA

817 questions targeting common human misconceptions, measuring whether a model gives factually true answers or repeats popular falsehoods.

Format
HF Dataset
Size
817 tasks
License
Apache-2.0
Published
Sep 2021
Notable for
Benchmark for evaluating hallucination and factual recall.

Cite

Notes

Only stored in your browser.

Where it's ranked

1

Related tools

1
View all

Implementations, trainers, datasets and scaffolds linked to this eval.

Papers

2

Contributors

3

FAQ

What is TruthfulQA?
817 questions targeting common human misconceptions, measuring whether a model gives factually true answers or repeats popular falsehoods.
What capabilities does TruthfulQA test?
TruthfulQA evaluates hallucination, factual recall.
How can a model improve its TruthfulQA score?
Tools linked to TruthfulQA on Sophon include Anthropic HH-RLHF - RL environments, datasets, and scaffolds that target this eval.
What license is TruthfulQA under?
TruthfulQA is available under Apache-2.0.