
Bench ENV RL Env (Prime Community)

Fresh

τ-bench: Tool-Agent-User benchmark for conversational agents in customer service domains with user simulation

Type: RL Env
Runtime: multi-turn
License: unknown
Size: v0.1.0
Published: Mar 2026


Public scores on this env


2 vf-eval reports across 1 model


Lift evidence

Eval, Tools known to lift, Source paper
τ-bench (tau-bench), Bench ENV RL Env (Prime Community), -