Prime

Prime is a team.

Type: team

Tools

Click Calibrate

Multi-turn visual click calibration tasks with click and computer tool formats across pixel and normalized coordinate schemas.

RL Env, Visual, Click, Calibration
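
The listing does not say how the pixel and normalized coordinate schemas relate to each other. As an illustration only, the sketch below assumes that normalized coordinates are fractions of the image width and height in the range [0, 1]; the function names and the clamping rule are hypothetical and are not taken from the environment itself.

```python
from typing import Tuple


def pixel_to_normalized(x: int, y: int, width: int, height: int) -> Tuple[float, float]:
    """Map a pixel coordinate to fractions of the image size (assumed [0, 1] schema)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return x / width, y / height


def normalized_to_pixel(nx: float, ny: float, width: int, height: int) -> Tuple[int, int]:
    """Map an assumed [0, 1] coordinate back to the nearest in-bounds pixel."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Clamp so that nx == 1.0 still lands on the last valid pixel column/row.
    px = min(max(round(nx * width), 0), width - 1)
    py = min(max(round(ny * height), 0), height - 1)
    return px, py


if __name__ == "__main__":
    print(pixel_to_normalized(960, 540, 1920, 1080))
    print(normalized_to_pixel(0.5, 0.5, 1920, 1080))
```

An environment could equally normalize to another range (for example 0 to 1000), so the schema should be checked against the environment's own documentation before relying on this convention.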

SWE GREP RL Env (Prime)

Just another SWE grep environment.

RL Env

Patent Office Action V1

Respond to a USPTO office action against a simulated examiner: multi-turn patent prosecution with retrieval tools, a per-rejection examiner state m...

RL Env, Legal, Patents, Tool Use

Chess ENV RL Env (Prime)

Chess environment where an agent plays as White against configurable opponents (random, LLM, or Stockfish)

RL Env, Game, Chess, Games

Context Tools RL Env (Prime)

Sandboxed Python-REPL harness for training models to manage their own context window across turns.

RL Env, Context Management

Gsm8k

GSM8K environment

RL Env, Gsm8k, Math

Reverse TEXT RL Env (Prime)

Reverse text character by character.

RL Env
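
The listing gives only the task statement for this environment. As a rough sketch of what such a task could involve, the example below reverses a string and scores a completion by its similarity to the reversed target. The scoring rule (a `difflib.SequenceMatcher` ratio) and the function names are assumptions chosen for illustration; the environment's actual reward is not described in this listing.

```python
from difflib import SequenceMatcher


def reverse_text(text: str) -> str:
    """Reverse a string by Unicode code point (not by grapheme cluster)."""
    return text[::-1]


def score_completion(prompt_text: str, completion: str) -> float:
    """Hypothetical reward: similarity in [0, 1] between completion and the reversed prompt."""
    target = reverse_text(prompt_text)
    return SequenceMatcher(None, target, completion).ratio()


if __name__ == "__main__":
    print(reverse_text("hello"))
    print(score_completion("hello", "olleh"))
```

A graded similarity score gives partial credit for nearly correct reversals, whereas an exact-match check would return only 0 or 1.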

Ifeval ALL RL Env (Prime)

Unified backdoor-ifeval env: difficulty, aggregation, no-v check, inoculation, group monitors

RL Env, Reward Hacking, Backdoor, Instruction Following, Security

Pmpp

PMPP CUDA evaluation environment with local and FastAPI eval modes

RL Env, Programming, Code

Agents Wikispeedia RL Env (Prime)

V1 Taskset/Harness environment training LangChain deep-agents on Wikispeedia navigation

RL Env, V1, Taskset, Harness, Web

Ifeval INOC RL Env (Prime)

Backdoor-ifeval env for inoculation experiments (pre-no-v version)

RL Env, Reward Hacking, Backdoor, Instruction Following, Security

Ifeval Groups RL Env (Prime)

Backdoor-ifeval env with group-level reward monitors for within-batch advantage variance

RL Env, Reward Hacking, Backdoor, Instruction Following, Security

Backdoor Ifeval RL Env (Prime)

Reward hacking with deterministic IF constraints

RL Env, Reward Hacking, Backdoor, Instruction Following, Security

LABS ENV RL Env (Prime)

Multimodal aim training environment where agents click targets in images. Demonstrates visual reasoning with coordinate-based responses.

RL Env, Multimodal, Vision, Tool Use

Prime GREP RL Env (Prime)

Cross-repo code-search tasks over prime-rl, verifiers, vllm, pytorch

RL Env, Code Search, Multi Repo

Calendar Scheduling RL Env (Prime)

Stateful tool-based environment for constrained meeting scheduling

RL Env, Stateful Tool, Calendar, Scheduling

Openrca ENV RL Env (Prime)

OpenRCA root cause analysis benchmark environment for Verifiers (ICLR 2025)

RL Env, Tool Use, Devops, Rca

BB DEMO RL Env (Prime)

BrowserEnv demo for web browsing tasks using Browserbase

RL Env, Browser, Browserbase, Web

Tic Tac Toe

Tic-Tac-Toe environment with configurable text or image board state and random opponent

RL Env, Tools, Game, Games

ANTI BOT RL Env (Prime)

WebVoyager browser benchmark with filtered dataset (600 tasks from sites without anti-bot protection)

RL Env, Browser, Browserbase, Webvoyager, Web

TAU 2 Synth RL Env (Prime)

τ²-bench with custom synthetic domains (library, fitness_gym, tech_support, telecom, cloud_incident_response, daily_planner, ev_charging_support)

RL Env, Tool Agent User, Tool Use, User Sim

Web Search Env

Multi-turn web-search QA environment with Exa-style benchmark support

RL Env, Search, QA, Tool Use