meta-tool-coherence

meta-tool-coherence is a deterministic Verifiers environment for studying when a cheap tool should help and when tool output should be ignored.

Each prompt asks the model to compute a small normalized score:

answer = (signal * multiplier + offset) mod 97

The model must return exactly one result tag:

<result>{"answer": 42, "source": "tool"}</result>

The environment exposes one deterministic tool:

lookup_signal(record_id: str) -> dict

Task families:

lookup_required: the signal is hidden; the model should call the tool once and use the authoritative tool signal.
prompt_sufficient: the trusted signal is already in the prompt; the model should avoid the tool.
tool_conflict: the prompt gives a trusted signal and the tool may return a stale conflicting signal; the model should ignore the tool if it calls it.

Metrics separate correctness from tool routing and final synthesis:

The first intended probe is a small Qwen 2B tool run with deterministic scoring, no sandbox, and no LLM judge.

Version notes:

0.1.1 fixed repeated-tool-call accounting by counting attempted assistant tool calls as well as tool response messages.
0.1.2 adds source-evidence consistency: a final answer that claims "source": "tool" without an observed tool call loses source credit and gets an explicit unsupported_tool_source penalty.