Taubench

Frontier

tau-bench challenges agents to coordinate, guide, and assist users in achieving shared objectives across complex enterprise domains.

Domain: rl-env
License: unknown
Published: Jan 2026

Attribution

Leaderboard scores
OpenReward

Top score 61.2 by GPT-4o - 12 models reporting (7 frontier)

Score history

[Chart: score history on a 0 to 100 scale, Nov 22 to Jul 24; labeled models: GPT-3.5 Turbo, GPT-4 Turbo, GPT-4o]

Top models

[Bar chart: Taubench scores for 12 models; highest value: GPT-4o at 61.2]

FAQ

What is Taubench?
tau-bench challenges agents to coordinate, guide, and assist users in achieving shared objectives across complex enterprise domains.
What is the current top score on Taubench?
The top reported score is 61.2 by GPT-4o, across 12 models reporting (7 from frontier labs).
What license is Taubench under?
The license for Taubench is listed as unknown.