BFCL

BFCL (Berkeley Function Calling Leaderboard) evaluates how well LLMs call functions and tools. Its tasks represent common function-calling use cases in agents and enterprise workflows.
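
To make "function calling" concrete, here is a minimal sketch of how a single model response could be checked against an expected call. It assumes the model emits a JSON object with `name` and `arguments` keys and that scoring is an exact match; the `call_matches` helper and the `get_weather` function are hypothetical illustrations, not this environment's actual grader or task format.

```python
import json


def call_matches(model_output: str, expected: dict) -> bool:
    """Return True if the model's JSON call has the expected name and arguments.

    Hypothetical helper: exact-match scoring is an assumption for illustration.
    """
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        # Output that is not valid JSON counts as a failed call.
        return False
    if not isinstance(call, dict):
        return False
    return (
        call.get("name") == expected["name"]
        and call.get("arguments") == expected["arguments"]
    )


# Hypothetical task: the expected call and two candidate model outputs.
expected = {"name": "get_weather", "arguments": {"city": "Paris", "unit": "celsius"}}
good = '{"name": "get_weather", "arguments": {"unit": "celsius", "city": "Paris"}}'
bad = '{"name": "get_weather", "arguments": {"city": "Paris"}}'

print(call_matches(good, expected))
print(call_matches(bad, expected))
```

Comparing parsed dictionaries rather than raw strings means argument order does not affect the result, while a missing argument does.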

Type: RL Env
Publisher: General Reasoning
Tags: Function Calling, Evaluation
Runtime: ORS
License: unknown
Size: 4441 tasks
Published: Jan 2026
Canonical: openreward.ai/GeneralReasoning/BFCL

Attribution

README: openreward.ai/GeneralReasoning/BFCL
Scores: OpenReward

Public scores on this env

1 vf-eval report across 1 model

1. Qwen3 Max Thinking (Alibaba): 67.7