RewardBench 2

Active

The 2025 successor to RewardBench: harder, with multiple completions per prompt (not just chosen vs. rejected) and refreshed prompts to address contamination.

Open

Publisher: Allen Institute for AI (Ai2)
Capabilities: LLM Judging, Safety
Format: HF Dataset
Size: 1865 tasks
License: ODC-BY-1.0
Published: Jun 2025
Notable for: Benchmark for evaluating LLM judging and safety.
Canonical: github.com/allenai/reward-bench
Also on: huggingface.co/datasets/allenai/reward-bench-2
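
The multi-completion format mentioned in the description can be illustrated with a small scoring sketch. This is not the official RewardBench 2 scoring code: the field names `chosen` and `rejected`, the helper names, and the scores are hypothetical, and it assumes one preferred completion per prompt that must strictly outscore every rejected completion.

```python
from typing import Sequence


def item_is_correct(chosen_score: float, rejected_scores: Sequence[float]) -> bool:
    """True only if the chosen completion strictly outscores every rejected one."""
    return all(chosen_score > r for r in rejected_scores)


def best_of_n_accuracy(items: Sequence[dict]) -> float:
    """Fraction of prompts where the chosen completion ranks strictly first."""
    if not items:
        raise ValueError("items must be non-empty")
    hits = sum(item_is_correct(it["chosen"], it["rejected"]) for it in items)
    return hits / len(items)


# Hypothetical reward-model scores for three prompts (illustrative only).
scored = [
    {"chosen": 0.9, "rejected": [0.2, 0.4, 0.1]},  # chosen ranks first
    {"chosen": 0.5, "rejected": [0.3, 0.7, 0.2]},  # a rejected completion wins
    {"chosen": 0.8, "rejected": [0.8, 0.1, 0.0]},  # a tie counts as a miss
]

print(best_of_n_accuracy(scored))
```

Under this assumption, a scorer that ranks one chosen and three rejected completions at random is right about 25% of the time, compared with 50% in a chosen-vs-rejected pair, which is one reason a multi-completion format is harder.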

Related tools

Implementations, trainers, datasets and scaffolds linked to this eval.

Reward Bench RL Env (Prime Intellect)

Prime Intellect

Evaluates pairwise answers from RewardBench datasets.

Trains toward · RL Env · Multi Lingual · Reward Bench · Safety

Papers

RewardBench 2: Advancing Reward Model Evaluation

preprint · 2025

Allen AI's expanded benchmark for reward models and LLM judges, with explicit reward-hacking probes that surface judges fooled by length, formatting, sycophancy, or self-preference.

Relation to this eval: introduces

Contributors

Nathan Lambert

FAQ

What is RewardBench 2?: The 2025 successor to RewardBench: harder, with multiple completions per prompt (not just chosen vs. rejected) and refreshed prompts to address contamination.
What capabilities does RewardBench 2 test?: RewardBench 2 evaluates LLM judging and safety.
How can a model improve its RewardBench 2 score?: Sophon links one tool to RewardBench 2: Reward Bench RL Env (Prime Intellect). Linked tools are RL environments, datasets, and scaffolds that target this eval.
What license is RewardBench 2 under?: RewardBench 2 is available under ODC-BY-1.0.