GYM ENV RL Env (Prime Intellect)

ReasoningGym suite of programmatically-generated reasoning tasks

  • Type: RL Env
  • Capabilities: Math, Logic
  • Runtime: single-turn
  • License: unknown
  • Size: v0.1.4
  • Published: Dec 2025

reasoning-gym-env

Overview

  • Environment ID: reasoning-gym-env
  • Short description: Single-turn evaluation over reasoning_gym procedural tasks with XML formatting.
  • Tags: reasoning, procedural, single-turn, xml, synthetic

Datasets

  • Primary dataset(s): Generated via reasoning_gym (e.g., arc_1d, or composite configs)
  • Source links: reasoning_gym library
  • Split sizes: Configurable counts for train/eval via loader args
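To illustrate what "procedurally generated with configurable split sizes" means in practice, here is a minimal sketch using only the Python standard library. It is a toy stand-in, not the reasoning_gym generator: generate_examples, build_splits, and the addition task are hypothetical, and offsetting the seed for the eval split is an assumption rather than documented behaviour of this environment.

```python
import random


def generate_examples(count: int, seed: int) -> list[dict]:
    """Toy procedural generator: the same seed always yields the same examples."""
    rng = random.Random(seed)
    examples = []
    for _ in range(count):
        a, b = rng.randint(0, 99), rng.randint(0, 99)
        examples.append({"question": f"What is {a} + {b}?", "answer": str(a + b)})
    return examples


def build_splits(num_train_examples: int, num_eval_examples: int, seed: int = 0):
    """Build train/eval splits of the requested sizes.

    Assumption: the eval split uses an offset seed so it is drawn
    independently of the train split. The real loader may differ.
    """
    train = generate_examples(num_train_examples, seed)
    eval_split = generate_examples(num_eval_examples, seed + 1)
    return train, eval_split


train, eval_split = build_splits(num_train_examples=5, num_eval_examples=3, seed=0)
print(len(train), len(eval_split))
print(train[0]["question"])
```

Because every example is derived from the seed, rerunning with the same arguments reproduces the same dataset, which is what makes results comparable across runs.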

Task

  • Type: single-turn
  • Rubric overview: Score computed via reasoning_gym task-specific scorer; optional format component

Quickstart

Run an evaluation with default settings:

prime eval run reasoning-gym-env

Configure model and sampling:

prime eval run reasoning-gym-env \
  -m gpt-4.1-mini \
  -n 20 -r 3 -t 1024 -T 0.7 \
  -a '{"gym": "arc_1d", "num_train_examples": 2000, "num_eval_examples": 2000}'

Notes:

  • Use gym to select a single dataset name, a list of names, or a composite specification.
  • Reports are written under ./environments/reasoning_gym_env/reports/ and auto-embedded below.
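Quoting the JSON passed to -a by hand is error-prone, so one option is to build the command in Python. The sketch below uses only the flags shown in the Quickstart; build_eval_command is a hypothetical helper, and passing a list of names as a JSON array is an assumption about how the loader accepts "a list of names". The composite specification format is not shown because it is not documented here.

```python
import json
import shlex


def build_eval_command(gym, num_train_examples=2000, num_eval_examples=2000,
                       seed=0, env_id="reasoning-gym-env"):
    """Return a shell-safe `prime eval run` command string (hypothetical helper)."""
    env_args = {
        "gym": gym,
        "num_train_examples": num_train_examples,
        "num_eval_examples": num_eval_examples,
        "seed": seed,
    }
    # json.dumps produces the payload; shlex.join adds the shell quoting.
    argv = ["prime", "eval", "run", env_id, "-a", json.dumps(env_args)]
    return shlex.join(argv)


# Single dataset name, as in the Quickstart.
single = build_eval_command("arc_1d", num_eval_examples=200)

# Assumption: several task names are passed as a JSON array.
multi = build_eval_command(["arc_1d", "leg_counting"], seed=42)

print(single)
print(multi)
```

The printed strings can be pasted into a shell, or the argument list can be handed to subprocess.run directly to skip shell quoting altogether.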

Environment Arguments

| Arg | Type | Default | Description |
| --- | --- | --- | --- |
| gym | str | "arc_1d" | Single task name, list of names, or composite config |
| num_train_examples | int | 2000 | Number of training examples |
| num_eval_examples | int | 2000 | Number of evaluation examples |
| seed | int | 0 | Random seed for dataset generation |

Metrics

| Metric | Meaning |
| --- | --- |
| reward | Task-specific score from reasoning_gym for parsed answer |
| format_reward | Adherence to <think>/<answer> XML format |
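The two metrics can be pictured as separate steps: check the completion's shape, then extract the answer and score it. The following is a rough sketch under stated assumptions, not the environment's actual parser or rubric. The function names and regular expressions are hypothetical, and exact_match_score is a simple stand-in for the task-specific reasoning_gym scorer.

```python
import re

# Assumption: a well-formed completion is one <think> block followed by one <answer> block.
FORMAT_RE = re.compile(r"\s*<think>.*?</think>\s*<answer>.*?</answer>\s*", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def format_reward(completion: str) -> float:
    """Return 1.0 if the whole completion follows the assumed XML layout, else 0.0."""
    return 1.0 if FORMAT_RE.fullmatch(completion) else 0.0


def parse_answer(completion: str):
    """Return the stripped text of the last <answer> block, or None if there is none."""
    matches = ANSWER_RE.findall(completion)
    return matches[-1].strip() if matches else None


def exact_match_score(parsed, expected: str) -> float:
    """Stand-in scorer: the real reward comes from the task's own scoring function."""
    return 1.0 if parsed is not None and parsed == expected.strip() else 0.0


completion = "<think>2 + 2 is 4</think>\n<answer> 4 </answer>"
parsed = parse_answer(completion)
print(format_reward(completion), parsed, exact_match_score(parsed, "4"))
```

Keeping the format check separate from answer scoring means a correct answer in a malformed completion can still be told apart from a well-formed but wrong one.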