BFCL

BFCL (Berkeley Function Calling Leaderboard) evaluates LLMs' ability to call functions and tools. Its dataset covers common function-calling use cases in agent and enterprise workflows.
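
To make the idea of scoring a function call concrete, here is a minimal sketch of one possible check: parse the model's output as JSON and compare the function name and arguments with a reference call. The `call_matches` helper, the `get_weather` function, and the `{"name": ..., "arguments": ...}` output shape are all hypothetical illustrations. This is not the scorer this environment uses.

```python
import json


def call_matches(model_output: str, expected: dict) -> bool:
    """Return True if a JSON tool call matches the expected name and arguments.

    Hypothetical helper for illustration only; not this environment's scorer.
    """
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return False  # malformed output counts as a failed call
    if not isinstance(call, dict):
        return False  # valid JSON, but not a call object
    # Dict comparison ignores key order, so argument order does not matter.
    return (
        call.get("name") == expected["name"]
        and call.get("arguments") == expected["arguments"]
    )


# Hypothetical reference call and two candidate model outputs.
expected = {"name": "get_weather", "arguments": {"city": "Paris", "unit": "celsius"}}
good = '{"name": "get_weather", "arguments": {"unit": "celsius", "city": "Paris"}}'
bad = '{"name": "get_weather", "arguments": {"city": "Paris"}}'

print(call_matches(good, expected))
print(call_matches(bad, expected))
```

Exact matching is the strictest choice. A real scorer might also accept several valid argument values or check types against the function's schema.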

Type: RL Env
Runtime: ORS
License: unknown
Size: 4441 tasks
Published: Jan 2026

Public scores on this env

1 vf-eval report across 1 model
