LABS ENV RL Env (Prime)

Multimodal aim training environment where agents click targets in images. Demonstrates visual reasoning with coordinate-based responses.

Type: RL Env
Publisher: Prime
Runtime: multi-turn
License: unknown
Size: v0.1.0
Published: Mar 2026

aim-labs-env

Multimodal aim training environment for vision-language model evaluation and RL training.

Overview

Environment ID: aim-labs-env
Tags: multi-turn, multimodal, vision, tool-use, train, eval
Screen Resolution: 1024x768 pixels
Default Turns: 20 per game

Task

Each rollout is a game where the agent must click on red target circles:

  1. Agent sees a 1024x768 image with a red target at a random position
  2. Agent calls click(x=..., y=...) to click on the target
  3. Hit = click within target radius, Miss = click outside
  4. Reward = hits / attempts (accuracy from 0.0 to 1.0)
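
The hit rule and reward in steps 3 and 4 can be sketched in a few lines of Python. This is an illustration only, not the environment's actual code: the function names, the inclusive boundary (a click exactly on the circle's edge counts as a hit), and the 0.0 reward for a game with no attempts are all assumptions.

```python
import math

def is_hit(click_x, click_y, target_x, target_y, radius):
    """Return True when the click lands within `radius` pixels of the target center.

    Assumption: the boundary is inclusive (distance == radius is a hit).
    """
    return math.hypot(click_x - target_x, click_y - target_y) <= radius

def accuracy(hits, attempts):
    """Reward as hits / attempts; returns 0.0 for zero attempts (an assumption)."""
    return hits / attempts if attempts else 0.0

# Hypothetical three-turn game at medium difficulty (60px radius).
clicks = [(500, 400), (100, 100), (900, 700)]
targets = [(510, 410), (300, 300), (905, 690)]
hits = sum(is_hit(cx, cy, tx, ty, 60) for (cx, cy), (tx, ty) in zip(clicks, targets))
reward = accuracy(hits, len(clicks))
```

In this made-up game the first and third clicks land about 14px and 11px from their targets, while the second is about 283px away, so two of the three attempts are hits.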

Tool

click(x: int, y: int)
| Parameter | Type | Description |
|-----------|------|-------------|
| x | int | Horizontal position (0 = left edge, 1024 = right edge) |
| y | int | Vertical position (0 = top edge, 768 = bottom edge) |
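
A caller may want to sanity-check coordinates before issuing a click. The helper below is a hypothetical sketch, not part of the environment: `validate_click` is an invented name, and accepting both edge values (0 and 1024 for x, 0 and 768 for y) follows the parameter descriptions above rather than any confirmed behavior of the tool.

```python
SCREEN_WIDTH, SCREEN_HEIGHT = 1024, 768

def validate_click(x, y):
    """Return (x, y) unchanged, or raise ValueError if the point is invalid.

    Assumption: both edges are accepted, matching the documented ranges.
    """
    for name, value, limit in (("x", x, SCREEN_WIDTH), ("y", y, SCREEN_HEIGHT)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an int, got {type(value).__name__}")
        if not 0 <= value <= limit:
            raise ValueError(f"{name}={value} is outside 0..{limit}")
    return x, y
```

For example, `validate_click(512, 384)` returns the screen center unchanged, while `validate_click(2000, 10)` raises `ValueError`.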

Quickstart

# Basic evaluation
prime eval run aim-labs-env -m qwen/qwen3-vl-235b-a22b-instruct

# With options
prime eval run aim-labs-env \
  -m openai/gpt-4o \
  -n 10 -r 3 \
  -a '{"difficulty": "easy", "max_turns": 10}'

# Demo mode (saves click visualizations)
prime eval run aim-labs-env \
  -m qwen/qwen3-vl-235b-a22b-instruct \
  -n 1 -r 1 \
  -a '{"demo": true, "max_turns": 5}'

Environment Arguments

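| Argument | Type | Default | Description |
|----------|------|---------|-------------|
| difficulty | str | "medium" | Target size preset |
| max_turns | int | 20 | Targets per game |
| num_examples | int | 100 | Number of game sessions |
| seed | int | None | Random seed |
| demo | bool | False | Save click visualization images |
| demo_output | str | "./demo_output" | Directory for demo images |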
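
When eval runs are scripted, the JSON string passed to `-a` can be generated rather than typed by hand. The sketch below is an assumption about workflow, not an official helper: `build_env_args` and `DEFAULTS` are hypothetical names, and the defaults are copied from the table above.

```python
import json
import shlex

# Defaults copied from the arguments table; used here only to reject typos.
DEFAULTS = {
    "difficulty": "medium",
    "max_turns": 20,
    "num_examples": 100,
    "seed": None,
    "demo": False,
    "demo_output": "./demo_output",
}

def build_env_args(**overrides):
    """Serialize the overridden arguments to a JSON string for the -a flag."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown environment arguments: {sorted(unknown)}")
    return json.dumps(overrides, sort_keys=True)

args = build_env_args(difficulty="hard", max_turns=10, seed=42)
# shlex.quote wraps the JSON in single quotes so the shell passes it through intact.
command = f"prime eval run aim-labs-env -m openai/gpt-4o -a {shlex.quote(args)}"
```

Because `json.dumps` writes Python's `True` as `true`, a call such as `build_env_args(demo=True)` produces the same form used in the demo-mode command in the Quickstart.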

Difficulty Levels

| Difficulty | Target Radius | Use Case |
|------------|---------------|----------|
| easy | 100px | Basic multimodal capability testing |
| medium | 60px | Standard difficulty |
| hard | 35px | Precise coordinate estimation |
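
To put the radii in perspective, one can estimate how often a uniformly random click would score a hit. This is a back-of-the-envelope sketch with two assumptions that the listing does not confirm: clicks are uniform over the 1024x768 screen, and the target disc lies entirely on screen. The names are illustrative.

```python
import math

SCREEN_AREA = 1024 * 768  # pixels
TARGET_RADIUS_PX = {"easy": 100, "medium": 60, "hard": 35}

def random_hit_probability(difficulty):
    """Fraction of the screen covered by the target disc.

    Assumes the disc is fully on screen; a disc clipped by an edge covers less.
    """
    radius = TARGET_RADIUS_PX[difficulty]
    return math.pi * radius * radius / SCREEN_AREA

for level in TARGET_RADIUS_PX:
    print(f"{level}: {random_hit_probability(level):.2%}")
```

Under these assumptions a random clicker scores roughly 4% on easy, 1.4% on medium, and 0.5% on hard, which gives a floor against which model accuracy can be compared.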

Metrics

| Metric | Description |
|--------|-------------|
| reward | Accuracy (hits / attempts) |
| total_hits | Successful clicks |
| total_attempts | Total clicks made |
| average_distance | Mean distance from click to target center (px) |
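
The four metrics can be derived from per-turn click and target positions. The following is a minimal sketch under stated assumptions, not the environment's implementation: `summarize_game` is a hypothetical name, distances are Euclidean, the hit boundary is inclusive, and an empty game reports 0.0 for both ratios.

```python
import math

def summarize_game(clicks, targets, radius):
    """Aggregate per-turn (x, y) clicks and target centers into the four metrics."""
    distances = [math.dist(click, target) for click, target in zip(clicks, targets)]
    total_attempts = len(distances)
    total_hits = sum(d <= radius for d in distances)
    return {
        "reward": total_hits / total_attempts if total_attempts else 0.0,
        "total_hits": total_hits,
        "total_attempts": total_attempts,
        "average_distance": sum(distances) / total_attempts if total_attempts else 0.0,
    }

# Hypothetical two-turn game at medium difficulty: one near miss of 5px, one 120px off.
metrics = summarize_game([(103, 104), (500, 500)], [(100, 100), (500, 380)], radius=60)
```

Here `average_distance` is 62.5px even though the reward is 0.5, which shows why the two metrics are reported separately: distance reflects how close misses were, while reward only counts hits.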

Example Interaction

User: Turn 1/20 - Click the target: [image]