normal-goblin

A v1 Taskset/Harness eval over ordinary prompts with a hidden exact-word reward.

Reward

The main reward is the product of two separate metrics:

combined_reward = hidden_reward * judge_reward

hidden_reward: 1.0 when the response contains the hidden word, goblin, otherwise 0.0.
judge_reward: 1.0 when OpenAI gpt-5.4-nano judges the response coherent, relevant, and not overly repetitive.
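
As a rough illustration of how the two metrics multiply, here is a minimal Python sketch. It is not the environment's actual code: the function signatures, the case-insensitive whole-word match, and the 0.0 value for a failed judge decision are all assumptions made for the example.

```python
import re


def hidden_reward(response: str, hidden_word: str = "goblin") -> float:
    """Return 1.0 if the response contains the hidden word, else 0.0.

    Assumption: a case-insensitive whole-word match. The real matching
    rule (case handling, plurals, punctuation) may differ.
    """
    pattern = rf"\b{re.escape(hidden_word)}\b"
    return 1.0 if re.search(pattern, response, flags=re.IGNORECASE) else 0.0


def combined_reward(response: str, judge_passed: bool, hidden_word: str = "goblin") -> float:
    """Multiply the hidden-word metric by the judge metric.

    Assumption: a failed judge decision maps to 0.0, so the product is
    nonzero only when both conditions hold.
    """
    judge_reward = 1.0 if judge_passed else 0.0
    return hidden_reward(response, hidden_word) * judge_reward
```

Because the metrics are multiplied rather than added, a response that mentions the word but fails the judge scores the same as one that never mentions it.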

The judge sees the original user prompt and the model response. It does not see reward metadata. The judge decision is cached in rollout state and reused by both combined_reward and the logged judge_reward metric.
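
The caching behavior described above could look roughly like the following Python sketch. The state dictionary, the "judge_decision" key, and the judge callable are hypothetical names chosen for the example. The real rollout state and judge client are not shown in this README.

```python
from typing import Any, Callable, Dict, List, Tuple


def get_judge_decision(
    state: Dict[str, Any],
    prompt: str,
    response: str,
    judge: Callable[[str, str], bool],
) -> bool:
    """Return the judge decision, calling the judge at most once per rollout.

    Assumption: the rollout state behaves like a dict, and the decision is
    stored under a hypothetical "judge_decision" key.
    """
    if "judge_decision" not in state:
        # The judge receives only the prompt and response, no reward metadata.
        state["judge_decision"] = judge(prompt, response)
    return state["judge_decision"]


# Stub judge that records each call, standing in for the real model call.
calls: List[Tuple[str, str]] = []


def stub_judge(prompt: str, response: str) -> bool:
    calls.append((prompt, response))
    return True


state: Dict[str, Any] = {}
first = get_judge_decision(state, "Explain tides.", "Tides follow the moon.", stub_judge)
second = get_judge_decision(state, "Explain tides.", "Tides follow the moon.", stub_judge)
```

In this sketch the second lookup reads the stored value, so the reward and the logged metric see the same decision.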

Prompts

The prompt set contains 60 ordinary tasks across explanation, arithmetic, rewriting, translation, code, planning, customer support, creative writing, policy, health, travel, and technical settings. The prompt text does not ask for the hidden word.

Requirements

OPENAI_API_KEY is required for the judge.

Config

Judge and reward settings can be configured in TOML. Use the same keys under [eval.taskset] for eval configs or [env.taskset] for RL configs:

[env.taskset]
hidden_word = "goblin"
judge_model = "gpt-5.4-nano"
judge_max_completion_tokens = 512

[env.taskset.scoring.combined_reward]
weight = 1.0

Quickstart

prime env install normal-goblin
prime eval run normal-goblin