
CUA Example RL Env (Browserbase)


Example environment demonstrating CUA (Computer Use Agent) mode browser automation

Type: RL Env
Publisher: Browserbase
License: unknown
Size: v0.1.1
Published: Jan 2026


Browser CUA Mode Example

A simple example environment demonstrating CUA (Computer Use Agent) mode browser automation using Browserbase.

CUA mode uses vision-based primitives to control the browser through screenshots, similar to how a human would interact with a screen.

How CUA Mode Works

CUA mode provides low-level vision-based operations:

  • click(x, y): Click at screen coordinates
  • type_text(text): Type text into focused element
  • scroll(direction): Scroll the page
  • screenshot(): Capture current screen state
  • navigate(url): Go to a URL

The agent sees screenshots and decides which actions to take based on visual understanding.
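The observe-and-act cycle described above can be sketched as a short loop. This is a minimal illustration, not the environment's actual implementation: `FakeBrowser`, `run_episode`, and `scripted_policy` are hypothetical names, the browser is an in-memory stub, and the policy replays a fixed script where a real agent would call a vision model.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Hypothetical in-memory stand-in for a CUA browser session."""

    url: str = "about:blank"
    log: list = field(default_factory=list)

    def navigate(self, url: str) -> None:
        self.url = url
        self.log.append(("navigate", url))

    def click(self, x: int, y: int) -> None:
        self.log.append(("click", x, y))

    def type_text(self, text: str) -> None:
        self.log.append(("type_text", text))

    def scroll(self, direction: str) -> None:
        self.log.append(("scroll", direction))

    def screenshot(self) -> bytes:
        # A real session would return image bytes; this stub encodes the URL.
        return f"screenshot of {self.url}".encode()


def run_episode(browser, policy, max_turns: int = 15) -> int:
    """Take a screenshot, ask the policy for an action, apply it, repeat.

    Returns the number of actions executed.
    """
    for turn in range(max_turns):
        observation = browser.screenshot()
        action = policy(observation, turn)
        if action is None:  # the policy signals that it is done
            return turn
        name, *args = action
        getattr(browser, name)(*args)
    return max_turns


def scripted_policy(observation: bytes, turn: int):
    """Hypothetical policy: replays a fixed script instead of calling a model."""
    script = [
        ("navigate", "https://example.com"),
        ("click", 512, 384),
        ("type_text", "hello"),
        ("scroll", "down"),
    ]
    return script[turn] if turn < len(script) else None
```

Calling `run_episode(FakeBrowser(), scripted_policy)` runs the four scripted actions and then stops. The default of 15 turns mirrors the `max_turns` default listed under Environment Arguments.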

Installation

# Install browser extras
uv pip install -e ".[browser]"

# Install this example environment
uv pip install -e ./environments/browser_cua_example

Configuration

Required Environment Variables

# Browserbase credentials
export BROWSERBASE_API_KEY="your-api-key"
# Optional: export BROWSERBASE_PROJECT_ID="your-project-id"

# API key for agent model
export OPENAI_API_KEY="your-openai-key"

Note: When running in manual server mode, ensure OPENAI_API_KEY is set in the terminal where the CUA server runs (Stagehand requires it internally).
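A small pre-flight check can catch a missing credential before an evaluation starts. The sketch below is not part of the environment; `missing_env_vars` is a hypothetical helper that only inspects the variable names listed above.

```python
import os

# Variable names taken from the configuration section above.
REQUIRED_VARS = ("BROWSERBASE_API_KEY", "OPENAI_API_KEY")
OPTIONAL_VARS = ("BROWSERBASE_PROJECT_ID",)


def missing_env_vars(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required environment variables are set.")
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise without touching the real environment.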

Usage

Quick Test Commands

# Default - pre-built image (fastest)
prime eval run browser-cua-example -m openai/gpt-4o-mini

# Binary upload (custom server)
prime eval run browser-cua-example -m openai/gpt-4o-mini -a '{"use_prebuilt_image": false}'

# Local development
prime eval run browser-cua-example -m openai/gpt-4o-mini -a '{"use_sandbox": false}'

Pre-built Docker Image (Default, Fastest)

By default, CUA mode uses a pre-built Docker image (deepdream19/cua-server:latest) for the fastest startup. The image ships with the CUA server binary and all of its dependencies pre-installed:

prime eval run browser-cua-example -m openai/gpt-4.1-mini -b https://api.openai.com/v1 -k OPENAI_API_KEY

This is the recommended approach for production use. Startup is ~5-10 seconds compared to ~30-60 seconds with binary upload.

Binary Upload Mode (Custom Server)

If you need to use a custom version of the CUA server, disable the prebuilt image to build and upload the binary at runtime:

prime eval run browser-cua-example -m openai/gpt-4.1-mini -b https://api.openai.com/v1 -k OPENAI_API_KEY -a '{"use_prebuilt_image": false}'

This mode:

  1. Builds the CUA server binary via Docker (first run only)
  2. Uploads the binary to a sandbox container
  3. Installs dependencies (curl) in the sandbox
  4. Starts the server

Manual Server Mode (Local Development)

For local development, you can run the CUA server manually:

  1. Start the CUA server (in a separate terminal):

    cd assets/templates/browserbase/cua
    export OPENAI_API_KEY="your-openai-key"
    pnpm dev
    

    The server runs on http://localhost:3000 by default.

  2. Run the evaluation with sandbox disabled:

    prime eval run browser-cua-example -m openai/gpt-4.1-mini -b https://api.openai.com/v1 -k OPENAI_API_KEY -a '{"use_sandbox": false}'
    

Custom Server URL

If running the CUA server on a different port:

prime eval run browser-cua-example -m openai/gpt-4.1-mini -b https://api.openai.com/v1 -k OPENAI_API_KEY -a '{"use_sandbox": false, "server_url": "http://localhost:8080"}'
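Before pointing the evaluation at a manually started server, it can help to confirm that something is listening on the chosen port. The sketch below uses only the Python standard library. `host_port` and `server_reachable` are hypothetical helpers, and the check only opens a TCP connection; it does not verify that the listener is actually the CUA server.

```python
import socket
from urllib.parse import urlsplit


def host_port(server_url: str):
    """Split a URL such as http://localhost:8080 into (host, port)."""
    parts = urlsplit(server_url)
    if parts.hostname is None:
        raise ValueError(f"not an absolute URL: {server_url!r}")
    # Fall back to the scheme's default port when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def server_reachable(server_url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the server's host and port succeeds."""
    try:
        with socket.create_connection(host_port(server_url), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `server_reachable("http://localhost:8080")` could be called before launching the run with `"server_url": "http://localhost:8080"`.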

Environment Arguments

| Argument | Default | Description |
|---|---|---|
| max_turns | 15 | Maximum conversation turns (recommended: 50 for complex tasks) |
| judge_model | "gpt-4o-mini" | Model for task completion judging |
| use_sandbox | True | Auto-deploy CUA server to sandbox |
| use_prebuilt_image | True | Use pre-built Docker image (fastest startup) |
| prebuilt_image | "deepdream19/cua-server:latest" | Docker image to use when use_prebuilt_image=True |
| server_url | "http://localhost:3000" | CUA server URL (only used when use_sandbox=False) |
| viewport_width | 1024 | Browser viewport width |
| viewport_height | 768 | Browser viewport height |
| save_screenshots | False | Save screenshots during execution |
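The `-a` flag takes these arguments as a JSON object, so the command line can be generated from a typed structure. The sketch below is illustrative only: `EnvArgs` and `eval_command` are hypothetical names, the defaults are copied from the argument list above, and only the `prime eval run ... -m ... -a ...` form shown in this document is used.

```python
import json
import shlex
from dataclasses import dataclass, fields


@dataclass
class EnvArgs:
    """Defaults mirror the documented environment arguments."""

    max_turns: int = 15
    judge_model: str = "gpt-4o-mini"
    use_sandbox: bool = True
    use_prebuilt_image: bool = True
    prebuilt_image: str = "deepdream19/cua-server:latest"
    server_url: str = "http://localhost:3000"
    viewport_width: int = 1024
    viewport_height: int = 768
    save_screenshots: bool = False

    def overrides(self) -> dict:
        """Return only the arguments that differ from their defaults."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) != f.default
        }


def eval_command(model: str, args: EnvArgs) -> str:
    """Build a shell-quoted `prime eval run` command line."""
    cmd = ["prime", "eval", "run", "browser-cua-example", "-m", model]
    overrides = args.overrides()
    if overrides:
        # json.dumps emits lowercase true/false, as the -a flag expects.
        cmd += ["-a", json.dumps(overrides)]
    return shlex.join(cmd)
```

For instance, `eval_command("openai/gpt-4o-mini", EnvArgs(use_sandbox=False))` reproduces the local development command from the Quick Test Commands section.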

Execution Modes Summary

| Mode | Flag | Startup Time | Use Case |
|---|---|---|---|
| Pre-built image (default) | None | ~5-10s | Production, fastest startup |
| Binary upload | use_prebuilt_image=false | ~30-60s | Custom server version |
| Manual server | use_sandbox=false | Instant | Local development |
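The way the two flags select a mode can be written as a small decision function. This is a sketch of the summary above, not the environment's own logic. `execution_mode` is a hypothetical name, and it assumes that `use_sandbox=False` takes precedence over `use_prebuilt_image`, which the summary implies but does not state.

```python
def execution_mode(use_sandbox: bool = True, use_prebuilt_image: bool = True) -> str:
    """Map the two flags to the execution mode they select."""
    if not use_sandbox:
        # Assumption: with the sandbox disabled, the image flag is ignored.
        return "manual server"
    if use_prebuilt_image:
        return "pre-built image"
    return "binary upload"


if __name__ == "__main__":
    for sandbox in (True, False):
        for prebuilt in (True, False):
            mode = execution_mode(sandbox, prebuilt)
            print(f"use_sandbox={sandbox}, use_prebuilt_image={prebuilt}: {mode}")
```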

Building a Custom Docker Image

To build and push a custom CUA server image:

cd assets/templates/browserbase/cua

# Build and push under a versioned tag
./build-and-push.sh bb-project-id-optional-20260326
# Same, with an explicit Docker Hub user
DOCKERHUB_USER=myuser ./build-and-push.sh bb-project-id-optional-20260326
# Also update the latest tag
DOCKERHUB_USER=myuser PUSH_LATEST=true ./build-and-push.sh bb-project-id-optional-20260326

Then use your custom image:

prime eval run browser-cua-example -m openai/gpt-4.1-mini -a '{"prebuilt_image": "myuser/cua-server:bb-project-id-optional-20260326"}'

Use the versioned tag first for validation. Only set PUSH_LATEST=true once you want latest to move as well.
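The value passed as `prebuilt_image` is an ordinary Docker image reference, so it can be assembled from the Docker Hub user and tag used in the build step. The helper below is a hypothetical sketch. `image_ref` and `image_args` are not part of the environment, and the `cua-server` repository name is taken from the examples above.

```python
import json


def image_ref(user: str, tag: str, repo: str = "cua-server") -> str:
    """Compose a Docker image reference of the form user/repo:tag."""
    if not user or not tag:
        raise ValueError("both user and tag are required")
    return f"{user}/{repo}:{tag}"


def image_args(user: str, tag: str) -> str:
    """Return the JSON string to pass to the -a flag for a custom image."""
    return json.dumps({"prebuilt_image": image_ref(user, tag)})
```

With the example values above, `image_args("myuser", "bb-project-id-optional-20260326")` yields the same JSON used in the command for running against the custom image.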

DOM vs CUA Mode Comparison

| Aspect | DOM Mode | CUA Mode |
|---|---|---|
| Control | Natural language via Stagehand | Vision-based coordinates |
| Server | None required | CUA server (auto-deployed) |
| MODEL_API_KEY | Required (for Stagehand) | Not required |
| Best for | Structured web interactions | Visual/complex UIs |
| Speed | Faster (direct DOM) | Slower (screenshots) |

Requirements

  • Python >= 3.10
  • Browserbase account with API credentials
  • OpenAI API key