BFCL
Fresh
BFCL is an evaluation of LLMs' ability to call functions and tools. The dataset represents common function calling use-cases in agents and enterprise workflows.
- Type: RL Env
- Publisher: General Reasoning
- Runtime: ORS
- License: unknown
- Size: 4441 tasks
- Published: Jan 2026
- Canonical: openreward.ai/GeneralReasoning/BFCL
Public scores on this env
11 vf-eval reports across 1 model