meta-tool-coherence
meta-tool-coherence is a deterministic Verifiers environment for studying
when a cheap tool should help and when tool output should be ignored.
Each prompt asks the model to compute a small normalized score:
answer = (signal * multiplier + offset) mod 97
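As a worked illustration of the formula, the sketch below computes the answer for two made-up inputs. The function name normalized_score and the sample values are assumptions chosen for illustration; they are not taken from the environment's code or data.

```python
def normalized_score(signal: int, multiplier: int, offset: int) -> int:
    """Compute (signal * multiplier + offset) mod 97, as in the formula above."""
    return (signal * multiplier + offset) % 97


# Hypothetical inputs: 12 * 7 + 5 = 89, which is already below 97.
print(normalized_score(12, 7, 5))  # 89

# Hypothetical inputs: 40 * 5 + 3 = 203, and 203 - 2 * 97 = 9.
print(normalized_score(40, 5, 3))  # 9
```

Because the modulus is 97, every answer falls in the range 0 to 96.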
The model must return exactly one result tag:
<result>{"answer": 42, "source": "tool"}</result>
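One way to check a completion against this format is sketched below. It is not the environment's actual parser: the helper name parse_result is hypothetical, and the sketch assumes that both keys are required, that answer is an integer, and that source is a string.

```python
import json
import re
from typing import Optional

# Non-greedy match so that two separate tags are counted as two matches.
RESULT_RE = re.compile(r"<result>(.*?)</result>", re.DOTALL)


def parse_result(text: str) -> Optional[dict]:
    """Return the JSON payload if text holds exactly one valid result tag, else None."""
    matches = RESULT_RE.findall(text)
    if len(matches) != 1:
        return None
    try:
        payload = json.loads(matches[0])
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    answer = payload.get("answer")
    source = payload.get("source")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(answer, int) or isinstance(answer, bool):
        return None
    if not isinstance(source, str):
        return None
    return payload


print(parse_result('<result>{"answer": 42, "source": "tool"}</result>'))
print(parse_result("The answer is 42."))
```

The first call prints the parsed dictionary; the second prints None because no result tag is present.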
The environment exposes one deterministic tool:
lookup_signal(record_id: str) -> dict
Task families:
- lookup_required: the signal is hidden; the model should call the tool once and use the authoritative tool signal.
- prompt_sufficient: the trusted signal is already in the prompt; the model should avoid the tool.
- tool_conflict: the prompt gives a trusted signal and the tool may return a stale conflicting signal; the model should ignore the tool if it calls it.
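The routing rule behind the three families can be summarized as a small lookup. This is a minimal sketch, assuming signals are integers; the names TOOL_IS_AUTHORITATIVE and authoritative_signal are hypothetical and do not come from the environment.

```python
from typing import Optional

# For each task family: should the final answer be built from the tool's signal?
TOOL_IS_AUTHORITATIVE = {
    "lookup_required": True,     # signal is hidden, so the tool is the only source
    "prompt_sufficient": False,  # trusted signal is already in the prompt
    "tool_conflict": False,      # tool output may be stale; the prompt wins
}


def authoritative_signal(
    family: str,
    prompt_signal: Optional[int],
    tool_signal: Optional[int],
) -> int:
    """Pick the signal the model is expected to use for a given task family."""
    if family not in TOOL_IS_AUTHORITATIVE:
        raise ValueError(f"unknown task family: {family}")
    chosen = tool_signal if TOOL_IS_AUTHORITATIVE[family] else prompt_signal
    if chosen is None:
        raise ValueError(f"no usable signal for family {family}")
    return chosen


# Hypothetical values: the prompt says 12 and a stale tool response says 99.
print(authoritative_signal("tool_conflict", 12, 99))      # 12
print(authoritative_signal("lookup_required", None, 31))  # 31
```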
Metrics separate correctness from tool routing and final synthesis:
- answer_exact
- source_exact
- source_evidence_match
- unsupported_tool_source
- used_tool
- tool_policy_match
- missed_recommended_tool
- unnecessary_tool_call
- repeated_tool_call
- stale_tool_answer_match
- raw_tool_dump
- schema_valid
The first intended probe is a small Qwen 2B tool run with deterministic scoring, no sandbox, and no LLM judge.
Version notes:
- 0.1.1 fixed repeated-tool-call accounting by counting attempted assistant tool calls as well as tool response messages.
- 0.1.2 adds source-evidence consistency: a final answer that claims "source": "tool" without an observed tool call loses source credit and gets an explicit unsupported_tool_source penalty.
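The source-evidence rule from 0.1.2 can be sketched as follows. This is an illustration of the described behavior only: the function name source_evidence and the 0.0/1.0 scoring values are assumptions, and the environment's real reward weights are not specified here.

```python
def source_evidence(claimed_source: str, observed_tool_calls: int) -> dict:
    """Flag a final answer that claims the tool as its source without any observed tool call."""
    unsupported = claimed_source == "tool" and observed_tool_calls == 0
    return {
        # Assumed convention: 1.0 means the claimed source is consistent with the evidence.
        "source_evidence_match": 0.0 if unsupported else 1.0,
        # Assumed convention: 1.0 means the penalty condition fired.
        "unsupported_tool_source": 1.0 if unsupported else 0.0,
    }


# Claims the tool but never called it: loses source credit and is flagged.
print(source_evidence("tool", 0))

# Claims the tool and did call it once: consistent.
print(source_evidence("tool", 1))
```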