Meta Data Analysis Lite

A deterministic small-table arithmetic environment for cheap RL probes.

Type: RL Env
Publisher: Abugoot
License: apache-2.0
Version: v0.3.0
Published: Jun 2026

meta-data-analysis-lite

meta-data-analysis-lite is a deterministic Verifiers environment for small tabular arithmetic questions.

Each example contains a generated CSV with columns for region, product, channel, units, returns, net units, unit price, and revenue. The model answers one question by returning exactly one JSON object inside a single <result>...</result> tag:

<result>{"answer": 123}</result>
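A minimal sketch of how a completion in this format could be parsed is shown here. The helper name extract_answer and its strictness rules (exactly one tag, exactly one "answer" key) are assumptions for illustration, not the environment's actual parser.

```python
import json
import re

# Hypothetical helper for illustration; not part of the environment's API.
RESULT_RE = re.compile(r"<result>(.*?)</result>", re.DOTALL)


def extract_answer(completion: str):
    """Return the value under "answer", or None if the format is violated."""
    bodies = RESULT_RE.findall(completion)
    if len(bodies) != 1:  # require exactly one <result> tag
        return None
    try:
        payload = json.loads(bodies[0])
    except json.JSONDecodeError:
        return None
    # Assumed schema check: a JSON object with the single key "answer".
    if not isinstance(payload, dict) or set(payload) != {"answer"}:
        return None
    return payload["answer"]
```

Returning None for every violation keeps the caller simple; a real scorer would likely distinguish the failure modes so it can assign partial credit.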

The initial v0.1 task families are:

  • total net_units by region
  • total revenue by product
  • filtered row counts
  • highest-revenue region with deterministic tie-breaking
  • product revenue differences
  • average unit_price by channel
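To make the task families concrete, the sketch below computes two of them over a tiny hand-written table using only the standard library. The header names, the sample rows, and the tie-breaking rule (alphabetical region name) are assumptions; the README does not specify how the environment breaks ties.

```python
import csv
import io
from collections import defaultdict

# Illustrative table; header names and values are assumed, not generator output.
CSV_TEXT = """region,product,channel,units,returns,net_units,unit_price,revenue
north,widget,online,10,1,9,2.50,22.50
south,widget,retail,9,0,9,2.50,22.50
north,gadget,retail,5,2,3,4.00,12.00
south,gadget,online,4,1,3,4.00,12.00
"""


def sum_by(rows, key, value):
    """Sum a numeric column grouped by a categorical column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row[value])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
net_units_by_region = sum_by(rows, "region", "net_units")
revenue_by_region = sum_by(rows, "region", "revenue")

# Assumed tie-break: highest revenue first, then alphabetical region name.
top_region = min(revenue_by_region, key=lambda r: (-revenue_by_region[r], r))
```

Both regions tie on revenue in this sample, so the sort key falls through to the region name and the result does not depend on row order.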

The reward is deterministic and shaped:

  • answer correctness, including numeric closeness for arithmetic mistakes
  • schema adherence for the single required answer key
  • format credit for valid JSON inside exactly one result tag
  • penalties for multiple candidates, repeated tags, code fences, or very long outputs
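The components above could be combined roughly as in the following sketch for numeric answers. All weights, the relative-error formula, and the length threshold are hypothetical values chosen for illustration; the environment's actual reward function may differ.

```python
import json
import re

RESULT_RE = re.compile(r"<result>(.*?)</result>", re.DOTALL)
FENCE = "`" * 3  # Markdown code-fence marker, built this way to avoid a literal fence


def shaped_reward(completion: str, expected: float, max_chars: int = 2000) -> float:
    """Hypothetical weights: 0.2 format, 0.1 schema, 0.7 correctness."""
    bodies = RESULT_RE.findall(completion)
    score = 0.0
    payload = None
    if len(bodies) == 1:
        try:
            payload = json.loads(bodies[0])
            score += 0.2  # format: valid JSON inside exactly one tag
        except json.JSONDecodeError:
            payload = None
    if isinstance(payload, dict) and set(payload) == {"answer"}:
        score += 0.1  # schema: only the required key is present
        answer = payload["answer"]
        if isinstance(answer, (int, float)) and not isinstance(answer, bool):
            # Partial credit that decays linearly with relative error.
            rel_err = abs(answer - expected) / max(abs(expected), 1.0)
            score += 0.7 * max(0.0, 1.0 - rel_err)
    # Assumed flat penalties for the listed format violations.
    if len(bodies) > 1:
        score -= 0.1
    if FENCE in completion:
        score -= 0.1
    if len(completion) > max_chars:
        score -= 0.1
    return max(0.0, min(1.0, score))
```

Because every term is a pure function of the completion and the expected value, the same completion always yields the same score, which is the property that makes a judge model unnecessary.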

By default the environment uses no tools, no sandbox, and no judge model. Starting with version 0.2.1, it can also run with tools=True, which exposes a cheap deterministic analyze_table(csv_text, question) helper. The tool computes the answer from the CSV text and the exact generated question, but the model must still finish with a single <result>...</result> answer. This mode is meant to test whether tool access improves computation without hurting final-answer coherence.

Version 0.3.0 adds an explicit tool-routing mix. In addition to the base aggregation tasks, task_families="base,tool_routing" includes direct row lookups and row comparisons where analyze_table is intentionally unsupported. These examples are labeled with tool_policy="avoid" and receive a small penalty for unnecessary tool calls, while base aggregation examples are labeled tool_policy="recommended". Optional distractor columns make the CSVs longer without changing the deterministic answer.
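The routing penalty can be pictured as a small deduction keyed on the example's tool_policy label. The function name, the flat (rather than per-call) deduction, and the 0.05 value below are all assumptions for illustration.

```python
def tool_policy_penalty(tool_policy: str, num_tool_calls: int, penalty: float = 0.05) -> float:
    """Hypothetical penalty: a flat deduction when tools are used on an 'avoid' example."""
    if tool_policy == "avoid" and num_tool_calls > 0:
        return penalty
    # "recommended" examples are never penalized for calling the tool in this sketch.
    return 0.0
```

Under this sketch, a row-lookup example answered directly incurs no deduction, while the same example answered after a call to analyze_table loses the penalty amount.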

Usage

from verifiers import load_environment

env = load_environment(
    "meta-data-analysis-lite",
    seed=20260603,
    num_examples=128,
    min_rows=8,
    max_rows=12,
)

Tool mode:

env = load_environment(
    "meta-data-analysis-lite",
    seed=20260603,
    num_examples=128,
    min_rows=8,
    max_rows=12,
    tools=True,
    max_tool_turns=3,
)

Tool-routing mix:

env = load_environment(
    "meta-data-analysis-lite",
    seed=20260603,
    num_examples=128,
    min_rows=14,
    max_rows=20,
    task_families="base,tool_routing",
    tools=True,
    max_tool_turns=3,
    distractor_columns=True,
)