impact-agent-bt

Environment for Bradley-Terry style pairwise comparison of blinded scientific papers.

Overview

Environment package name: impact-agent-bt
Local eval ID after install: impact-agent-bt
Task type: multi-turn tool-use (vf.StatefulToolEnv)
Domain: blinded dementia-research paper comparisons

Task

The model is given two blinded papers, Paper A and Paper B, and must inspect sections from both before submitting which paper should win the pairwise comparison.

The final tool call must submit a JSON object with exactly these three fields:

predicted_winner: "A" or "B"
confidence_logit: finite numeric confidence, expressed as a logit
reasoning: brief explanation

Tools

The environment exposes two tools to the model:

scan_paper(section_name: str, target_paper: str)

Reads a section from Paper A or Paper B.
target_paper must be "A" or "B".
Raises a tool error if the section does not exist.

submit_preference(prediction_json: str)

Submits the final pairwise decision as a JSON string.
Required schema:
- {"predicted_winner": "A" | "B", "confidence_logit": <number>, "reasoning": <string>}
Extra fields are rejected.
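
The schema rules above can be sketched as a small validator. This is a minimal illustration only, not the code in tools.py: the function name validate_prediction, the REQUIRED_KEYS constant, and the use of ValueError are assumptions made for the example.

```python
import json
import math

# Assumption: the three fields named in the schema are the only ones allowed.
REQUIRED_KEYS = {"predicted_winner", "confidence_logit", "reasoning"}


def validate_prediction(prediction_json: str) -> dict:
    """Parse a submit_preference payload and raise ValueError if it is invalid."""
    try:
        payload = json.loads(prediction_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc

    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")

    # Exact key match: missing fields and extra fields are both rejected.
    if set(payload) != REQUIRED_KEYS:
        raise ValueError(f"expected exactly {sorted(REQUIRED_KEYS)}")

    if payload["predicted_winner"] not in ("A", "B"):
        raise ValueError('predicted_winner must be "A" or "B"')

    logit = payload["confidence_logit"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    is_number = isinstance(logit, (int, float)) and not isinstance(logit, bool)
    # json.loads accepts NaN and Infinity by default, so check finiteness too.
    if not is_number or (isinstance(logit, float) and not math.isfinite(logit)):
        raise ValueError("confidence_logit must be a finite number")

    if not isinstance(payload["reasoning"], str):
        raise ValueError("reasoning must be a string")

    return payload
```

A payload such as {"predicted_winner": "A", "confidence_logit": 1.5, "reasoning": "stronger cohort design"} passes this check, while one with an added field or a non-finite logit raises ValueError.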

Notes:

Hidden tool args are injected from environment state via StatefulToolEnv.update_tool_args(...).
The environment tracks viewed sections for both papers.
The rollout is marked complete after a valid submit_preference(...) call.
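
The hidden-argument pattern in these notes can be illustrated without the library itself. The sketch below is standalone Python and does not reproduce the StatefulToolEnv.update_tool_args(...) signature, which may differ between verifiers versions. The state keys papers and viewed_sections, the helper inject_hidden_args, and the use of ValueError for tool errors are all hypothetical.

```python
from typing import Any


def inject_hidden_args(
    tool_name: str, tool_args: dict[str, Any], state: dict[str, Any]
) -> dict[str, Any]:
    """Return a copy of tool_args extended with state the model never sees."""
    merged = dict(tool_args)
    if tool_name == "scan_paper":
        merged["papers"] = state["papers"]            # hypothetical hidden arg
        merged["viewed"] = state["viewed_sections"]   # hypothetical hidden arg
    return merged


def scan_paper(
    section_name: str,
    target_paper: str,
    papers: dict[str, dict[str, str]],
    viewed: dict[str, set[str]],
) -> str:
    """Read one section and record that it was viewed."""
    if target_paper not in ("A", "B"):
        raise ValueError('target_paper must be "A" or "B"')
    sections = papers[target_paper]
    if section_name not in sections:
        raise ValueError(f"Paper {target_paper} has no section {section_name!r}")
    viewed[target_paper].add(section_name)
    return sections[section_name]


# Usage: the model supplies only section_name and target_paper.
state = {
    "papers": {"A": {"abstract": "text of A"}, "B": {"abstract": "text of B"}},
    "viewed_sections": {"A": set(), "B": set()},
}
model_args = {"section_name": "abstract", "target_paper": "A"}
text = scan_paper(**inject_hidden_args("scan_paper", model_args, state))
```

Keeping the paper contents and the viewed-section record in environment state means the model-facing tool signature stays at two arguments, while the environment can still check later whether both papers were inspected.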

Reward / Metrics

The default rubric (BradleyTerryRubric) combines:

order_reward: positive reward for the correct winner, scaled by sigmoid-transformed confidence
bilateral_scan_bonus: rewards scanning evidence-bearing sections from both papers
reasoning_bonus: rewards a valid JSON reasoning payload in the submission
completion_bonus: rewards finishing with a valid submission

Key tunables:

order_reward_weight
bilateral_scan_bonus_weight
reasoning_bonus_weight
completion_weight
logit_clip
max_turns
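
One way to read the components and tunables above is sketched below. It is not the BradleyTerryRubric implementation. It assumes that a wrong winner earns zero order reward, that each bonus is a score in [0, 1], and that the components are combined as a weighted sum. The function names are hypothetical; the default weights and logit_clip are the defaults listed under Environment Arguments.

```python
import math


def order_reward(
    predicted_winner: str,
    true_winner: str,
    confidence_logit: float,
    logit_clip: float = 8.0,
) -> float:
    """Sigmoid of the clipped logit if the winner is correct, else 0.0 (assumed)."""
    clipped = max(-logit_clip, min(logit_clip, confidence_logit))
    confidence = 1.0 / (1.0 + math.exp(-clipped))
    return confidence if predicted_winner == true_winner else 0.0


def total_reward(
    order: float,
    scan_bonus: float,
    reasoning_bonus: float,
    completion_bonus: float,
    order_reward_weight: float = 1.0,
    bilateral_scan_bonus_weight: float = 0.2,
    reasoning_bonus_weight: float = 0.05,
    completion_weight: float = 0.1,
) -> float:
    """Weighted sum of the four components (the combination rule is assumed)."""
    return (
        order_reward_weight * order
        + bilateral_scan_bonus_weight * scan_bonus
        + reasoning_bonus_weight * reasoning_bonus
        + completion_weight * completion_bonus
    )
```

Under these assumptions, clipping at logit_clip bounds how much an extreme confidence_logit can change the order reward: any logit above 8.0 yields the same value as 8.0.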

Dataset

The runtime loader supports normalized pair/paper JSONL artifacts and legacy JSONL records.

Default behavior:

Loads from packaged dataset resources under impact_agent_bt/data/
Supports selecting the train, val, test, or all split when the corresponding files are available

Optional override:

Pass jsonl_path to load_environment() to use a custom dataset base path

Environment Arguments

Supported load_environment(...) args:

| Arg | Type | Default | Description |
| --- | --- | --- | --- |
| `split` | `str` | `"train"` | Dataset split: `train`, `val`, `test`, `all` |
| `max_turns` | `int` | `10` | Max tool interactions per rollout |
| `jsonl_path` | `str \| None` | `None` | Optional dataset path override |
| `order_reward_weight` | `float` | `1.0` | Correct-order reward weight |
| `bilateral_scan_bonus_weight` | `float` | `0.2` | Bonus for scanning both papers |
| `reasoning_bonus_weight` | `float` | `0.05` | Bonus for valid JSON reasoning payload |
| `completion_weight` | `float` | `0.1` | Bonus for successful completion |
| `logit_clip` | `float` | `8.0` | Confidence clipping before sigmoid |

Quickstart

Install the environment package locally:

prime env install impact-agent-bt

Run local evaluation:

prime eval run impact-agent-bt -m gpt-4.1-mini

Run with explicit args:

prime eval run impact-agent-bt -m gpt-4.1-mini -a '{"split":"train","max_turns":10}'
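
When the argument set grows, hand-writing the quoted JSON for -a becomes error-prone. The following sketch builds the same command string from a Python dict using only the standard library. The helper name build_eval_command is hypothetical, and it uses only the CLI flags shown above.

```python
import json
import shlex


def build_eval_command(env_id: str, model: str, env_args: dict) -> str:
    """Return a shell-ready `prime eval run` command with JSON-encoded env args."""
    # Compact separators keep the JSON free of spaces.
    args_json = json.dumps(env_args, separators=(",", ":"))
    # shlex.quote wraps the JSON in single quotes so the shell passes it intact.
    return " ".join(
        ["prime", "eval", "run", env_id, "-m", model, "-a", shlex.quote(args_json)]
    )


command = build_eval_command(
    "impact-agent-bt", "gpt-4.1-mini", {"split": "train", "max_turns": 10}
)
print(command)
```

For the arguments shown, this reproduces the explicit-args command above.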

Hosted Training

Example hosted training config lives at:

configs/hosted/impact-agent-bt.toml

Run:

prime rl run @ configs/hosted/impact-agent-bt.toml

Development Notes

Source-of-truth implementation: environments/impact_agent_bt/impact_agent_bt/env.py
Tools: environments/impact_agent_bt/impact_agent_bt/tools.py
Rubric: environments/impact_agent_bt/impact_agent_bt/rubrics/bradley_terry.py
Dataset loader: environments/impact_agent_bt/impact_agent_bt/dataset.py