Tau2

Active

Evaluating Conversational Agents in a Dual-Control Environment

Publisher: Sierra
Domain: Assistants
License: MIT
Published: Dec 2025
Notable for: Benchmark for evaluating conversational assistants.


Related tools


Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is Tau2?
Tau2 is a benchmark published by Sierra for evaluating conversational agents in a dual-control environment.
How can a model improve its Tau2 score?
Tools linked to Tau2 on Sophon include TAU 2 Bench RL Env (Community), TAU 2 Synth RL Env (Prime), TAU 2 Bench RL Env (Prime Intellect), and TAU 2 Bench RL Env (Community). These are RL environments, datasets, and scaffolds that target this eval.
What license is Tau2 under?
Tau2 is available under the MIT license.