DOJO RL Env (Chakra Labs)

Environment for running benchmarks using Dojo

Type: RL Env
Publisher: Chakra Labs
Runtime: multi-turn
License: unknown
Size: v0.1.9
Published: Oct 2025

dojo

Overview

  • Environment ID: chakra-labs/dojo
  • Short description: Multi-turn agent evaluation using Dojo infrastructure for task execution
  • Tags: multi-turn, tool-use, benchmark, agent-evaluation, multimodal, dojo, web

Datasets

  • Primary dataset(s): dojo-mini-bench - Collection of multi-turn tasks including LinkedIn, Linear, Gmail
  • Source links: Dojo Documentation

Task

  • Type: multi-turn, tool use
  • Parser: OpenAI-compatible tool calling format
  • Rubric overview: Task-specific verification logic

Quickstart

First, obtain a Dojo API key.

Run an evaluation with default settings:

DOJO_API_KEY="your_key" uv run vf-eval dojo

Configure model and sampling:

DOJO_API_KEY="your_key" uv run vf-eval dojo -m gpt-4.1-mini -n 10 -r 1

To run with the Browserbase engine:

DOJO_API_KEY="your_key" \
BROWSERBASE_PROJECT_ID="project_id" \
BROWSERBASE_API_KEY="your_browserbase_key" \
DOJO_ENGINE=browserbase \
BROWSERBASE_CONCURRENT_LIMIT=1 \
uv run vf-eval dojo -m gpt-4.1-mini -n 10 -r 1

Notes:

  • Use -a / --env-args to pass environment-specific configuration as a JSON object.
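Because -a expects a JSON object on the command line, quoting is easy to get wrong by hand. The following is a minimal sketch that serializes a dictionary and prints a shell-safe command without running it. The helper name build_eval_command and the key "example_option" are hypothetical placeholders, not documented Dojo settings; consult the Dojo documentation for the arguments the environment actually accepts.

```python
import json
import shlex


def build_eval_command(env_args, model="gpt-4.1-mini", num_examples=10, rollouts=1):
    """Return the vf-eval argument list, with env_args serialized as JSON for -a."""
    return [
        "uv", "run", "vf-eval", "dojo",
        "-m", model,
        "-n", str(num_examples),
        "-r", str(rollouts),
        "-a", json.dumps(env_args),
    ]


# "example_option" is a hypothetical key used only to illustrate the quoting.
cmd = build_eval_command({"example_option": "value"})

# shlex.join quotes each argument so the printed line can be pasted into a shell.
print(shlex.join(cmd))
```

DOJO_API_KEY still has to be set in the environment before the printed command is run.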

Metrics

Task-specific verification returns a score between 0.0 and 1.0: failure is 0.0, partial success is strictly between 0.0 and 1.0, and full success is 1.0.

Metric    Meaning
reward    Score between 0.0 and 1.0
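As an illustration of how the reward scale can be read, the sketch below buckets a list of per-rollout rewards into failure, partial success, and success, and reports the mean. The function names and the input list are assumptions for this example; they are not part of the Dojo or vf-eval output format.

```python
def classify_reward(reward):
    """Map a reward in [0.0, 1.0] to 'failure', 'partial', or 'success'."""
    if not 0.0 <= reward <= 1.0:
        raise ValueError(f"reward must be between 0.0 and 1.0, got {reward}")
    if reward == 0.0:
        return "failure"
    if reward == 1.0:
        return "success"
    return "partial"


def summarize(rewards):
    """Count rewards per outcome bucket and compute the mean reward."""
    counts = {"failure": 0, "partial": 0, "success": 0}
    for reward in rewards:
        counts[classify_reward(reward)] += 1
    mean = sum(rewards) / len(rewards) if rewards else 0.0
    return {"mean_reward": mean, **counts}


# Hypothetical rewards from three rollouts.
summary = summarize([0.0, 0.5, 1.0])
print(summary)
```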

For more information, see the Dojo Verifiers Integration Documentation.