opencode-harbor
Overview
- Environment ID:
opencode-harbor - Short description: Environment for running an agent with OpenCode on Harbor tasks
- Tags: opencode, cli_agent, harbor
Datasets
- Primary dataset(s): Harbor tasks (hello-world included)
- Source links: https://github.com/laude-institute/harbor
- Split sizes: 1 example task included
Task
- Type: multiturn, cli_agent
- Rubric overview: Binary, returned by running task tests
Quickstart
Run an evaluation with default settings:
prime eval run opencode-harbor
Configure model and sampling:
prime eval run opencode-harbor -m gpt-4.1-mini -n 20 -r 3 -t 1024 -T 0.7 -a '{"key": "value"}' # env-specific args as JSON
Notes:
- Use
-a/--env-argsto pass environment-specific configuration as a JSON object.
Environment Arguments
Document any supported environment arguments and their meaning. Example:
| Arg | Type | Default | Description |
|---|---|---|---|
max_examples | int | -1 | Limit on dataset size (use -1 for all) |
Metrics
Summarize key metrics your rubric emits and how they’re interpreted.
| Metric | Meaning |
|---|---|
reward | Main scalar reward (weighted sum of criteria) |
How It Works
- Creates a Prime sandbox for each Harbor task
- Installs OpenCode CLI in the sandbox via curl
- Configures OpenCode with an
openai-compatibleprovider pointing to the intercepted base URL - Runs OpenCode autonomously on the task instruction
- Intercepts all API requests through Prime Tunnel
- Computes reward by running Harbor test scripts
Requirements
- Harbor tasks directory with
task.tomlandinstruction.mdfiles - Docker images specified in task configs
Reward
Uses Harbor's standard reward mechanism:
- Runs
tests/test.shafter agent completion - Reads reward from
/logs/verifier/reward.txtor/logs/verifier/reward.json - Returns float reward value (typically 0 or 1)
Notes
- OpenCode is installed at runtime (adds ~10-20s to startup)
- The agent writes
/tmp/vf_completewhen finished - Agent logs are saved to
/logs/agent/opencode.txtin the sandbox - Uses
@ai-sdk/openai-compatibleprovider for API interception