termforge-env
The Verifiers environment package for TermForge. It loads YAML tasks from
tasks/train/ (relative to the termforge repo root), wraps them in a multi-turn,
tool-using harness backed by either a MockSandbox (for laptop testing) or a
DockerSandbox (for real training), and exposes a five-component reward function.
Install on Prime Intellect Lab
# Hub (hosted RL + eval from any machine):
prime env install ibrahimdaud/termforge-env
# Editable dev (from termforge repo root):
uv pip install -e "./environments/termforge_env[verifiers,docker]"
prime eval run ibrahimdaud/termforge-env -m poolside/Laguna-XS.2 -n 5
Use directly (no Verifiers)
The package also works as a plain Python library, which is handy for the laptop mock test pipeline.
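from termforge_env import load_tasks, Harness, MockSandbox, MockAgent, compute_reward

# Load tasks from the training directory, filtered by difficulty.
tasks = load_tasks("tasks/train", difficulty=["hard"])
# Scripted agent that replays canned responses instead of calling a model.
agent = MockAgent.from_canned_responses(["bash> ls\n", "bash> done\n"])
# Run one rollout of at most 10 turns against the mock sandbox.
harness = Harness(sandbox=MockSandbox(), max_turns=10)
rollout = harness.run(task=tasks[0], agent=agent)
# Score the rollout; returns the total reward and its per-component values.
reward, components = compute_reward(tasks[0], rollout, sandbox=MockSandbox({"exit 0": 0}))
print(f"reward={reward:.3f}, components={components}")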
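The `MockSandbox({"exit 0": 0})` argument in the snippet above suggests a mapping from command strings to canned exit codes. The following is a minimal sketch of that idea, not the package's actual implementation: the `run(command) -> int` method, the class names `SandboxSketch` and `CannedSandbox`, and the helper `all_checks_pass` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Protocol


class SandboxSketch(Protocol):
    """Hypothetical interface: run a command and report its exit code."""

    def run(self, command: str) -> int: ...


class CannedSandbox:
    """Illustrative mock: returns pre-registered exit codes, executes nothing."""

    def __init__(self, exit_codes: Optional[Dict[str, int]] = None, default: int = 1) -> None:
        self.exit_codes = dict(exit_codes or {})
        self.default = default          # exit code for commands with no canned entry
        self.history: List[str] = []    # every command seen, in call order

    def run(self, command: str) -> int:
        self.history.append(command)
        return self.exit_codes.get(command, self.default)


def all_checks_pass(sandbox: SandboxSketch, checks: List[str]) -> bool:
    """True when every check command exits with code 0."""
    return all(sandbox.run(check) == 0 for check in checks)


sandbox = CannedSandbox({"exit 0": 0})
print(all_checks_pass(sandbox, ["exit 0"]))
```

Because the mock only looks up strings, a test can run on a laptop without Docker; a container-backed class satisfying the same assumed interface could be swapped in for real training.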
Module layout
termforge_env/
├── __init__.py   public API
├── taskset.py    load YAML tasks; TermForgeTaskset for Verifiers
├── harness.py    multi-turn rollout loop; MockAgent for tests
├── sandbox.py    Sandbox protocol; MockSandbox + DockerSandbox
└── reward.py     5 reward functions; aggregator
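As a rough illustration of what an aggregator over several reward components could look like, the sketch below combines named component scores into a single value with a weighted average. This is an assumption for illustration only: the function name `aggregate_sketch`, the component names, the weights, and the clipping to [0, 1] are hypothetical and are not taken from reward.py.

```python
from typing import Dict, Tuple


def aggregate_sketch(
    components: Dict[str, float], weights: Dict[str, float]
) -> Tuple[float, Dict[str, float]]:
    """Weighted average of component scores (hypothetical scheme).

    Each component is clipped to [0, 1]; components missing from
    `components` count as 0. Returns (total, clipped components).
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    clipped = {
        name: min(1.0, max(0.0, components.get(name, 0.0))) for name in weights
    }
    total = sum(weights[name] * clipped[name] for name in weights) / total_weight
    return total, clipped


# Placeholder component names and weights, purely for demonstration.
reward, parts = aggregate_sketch(
    {"component_a": 1.0, "component_b": 0.0},
    {"component_a": 3.0, "component_b": 1.0},
)
print(f"reward={reward:.3f}, components={parts}")
```

Returning the per-component values alongside the total mirrors the `(reward, components)` pair that `compute_reward` returns in the usage example, which makes it easier to see which component drove a given score.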