ARC (AI2 Reasoning Challenge)

Grade-school science multiple-choice questions (Easy and Challenge sets) drawn from US standardized tests - an early language-understanding benchmark.

Domain: science
Format: HF Dataset
Size: 7787 tasks
License: CC-BY-SA-4.0
Published: May 2026
Notable for: Benchmark for evaluating scientific reasoning and factual recall on grade-school science questions.

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is ARC (AI2 Reasoning Challenge)?
Grade-school science multiple-choice questions (Easy and Challenge sets) drawn from US standardized tests - an early language-understanding benchmark.
What capabilities does ARC (AI2 Reasoning Challenge) test?
ARC (AI2 Reasoning Challenge) evaluates scientific reasoning and factual recall.
How can a model improve its ARC (AI2 Reasoning Challenge) score?
Tools linked to ARC (AI2 Reasoning Challenge) on Sophon include AI 2 ARC RL Env (Community), OpenOrca, and SlimOrca. These are RL environments, datasets, and scaffolds that target this eval.
What license is ARC (AI2 Reasoning Challenge) under?
ARC (AI2 Reasoning Challenge) is available under CC-BY-SA-4.0.
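
How is an ARC (AI2 Reasoning Challenge) score computed?
Because every item is a multiple-choice question with one answer key, a score is usually plain accuracy. The sketch below is illustrative only: it assumes each record has a `question`, a `choices` dict with parallel `text` and `label` lists, and an `answerKey`, which is the layout commonly seen in the Hugging Face release. The `predict` callable, the `always_first` baseline, and the sample rows are hypothetical and are not taken from the dataset.

```python
from typing import Callable, Iterable, List

# predict(question, choice_texts, choice_labels) -> chosen label (hypothetical interface)
Predictor = Callable[[str, List[str], List[str]], str]


def arc_accuracy(records: Iterable[dict], predict: Predictor) -> float:
    """Return the fraction of records where predict() matches answerKey."""
    total = 0
    correct = 0
    for rec in records:
        texts = rec["choices"]["text"]
        labels = rec["choices"]["label"]
        guess = predict(rec["question"], texts, labels)
        correct += int(guess == rec["answerKey"])
        total += 1
    # Avoid dividing by zero on an empty split.
    return correct / total if total else 0.0


# Made-up rows in the assumed schema. Labels may be letters or digits,
# so compare against the labels given in each record rather than "A".."D".
sample = [
    {
        "question": "Which of these gives off its own light?",
        "choices": {"text": ["a mirror", "a candle flame", "a window", "a rock"],
                    "label": ["A", "B", "C", "D"]},
        "answerKey": "B",
    },
    {
        "question": "Which tool measures temperature?",
        "choices": {"text": ["a thermometer", "a ruler", "a scale", "a clock"],
                    "label": ["1", "2", "3", "4"]},
        "answerKey": "1",
    },
]


def always_first(question: str, texts: List[str], labels: List[str]) -> str:
    """Trivial baseline: always pick the first listed option."""
    return labels[0]


score = arc_accuracy(sample, always_first)
```

A real model would replace `always_first`; reporting Easy and Challenge accuracy separately keeps the two sets comparable with published numbers.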