Function-vector (FV) heads (Todd et al., 2024) are typically identified by the magnitude of their causal contribution to in-context rule tasks, under the implicit assumption that the top-ranked set is a homogeneous functional class. This assumption fails. We replace magnitude-only ranking with a sign-preserving criterion (refined DLA + permutation FDR) and validate each candidate by path patching. The FV head population then splits into two opposing sub-populations: writers push the rule-correct logit up, and cancellers push it down. A four-condition canonical verdict holds in 13/15 cells across three model families and six Pythia scales, and a sign-shuffle test rejects homogeneity in 5/6 main cells. The structure is invisible to magnitude-only ranking: the top-20 set of Todd et al. captures 64% of cancellers but only 4% of writers on the hierarchical task, and 59% of writers but only 8% of cancellers on the modular task. We rule out six artefact accounts on all 27 canceller (cell, head) pairs: induction overlap, sinks, generic importance, rank-1 copy-suppression, V-cascade, and rank-nearest non-FV controls. Zero-ablating cancellers yields +0.13 to +0.29 nats of logit gain in 6/6 main cells, with a directionally consistent accuracy effect of +2 to +7 pp.
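The sign-preserving selection described above can be illustrated with a minimal sketch. The abstract does not spell out the procedure, so everything here is an assumption made for illustration rather than the authors' method: the input layout (one row per head, one column per prompt, each entry that head's direct contribution to the rule-correct logit), the sign-flip permutation test, the Benjamini-Hochberg step used for FDR control, and all function names.

```python
import numpy as np

def sign_flip_pvalues(dla, n_perm=2000, seed=0):
    """Two-sided sign-flip permutation p-value for each head.

    dla: hypothetical array of shape (n_heads, n_prompts); entry [h, p] is
    head h's direct contribution to the rule-correct logit on prompt p.
    """
    rng = np.random.default_rng(seed)
    n_heads, n_prompts = dla.shape
    observed = dla.mean(axis=1)
    exceed = np.zeros(n_heads)
    for _ in range(n_perm):
        # Under the null, each prompt's contribution is equally likely + or -.
        signs = rng.choice([-1.0, 1.0], size=(1, n_prompts))
        null_mean = (dla * signs).mean(axis=1)
        exceed += np.abs(null_mean) >= np.abs(observed)
    # Add-one correction keeps every p-value strictly positive.
    return observed, (exceed + 1.0) / (n_perm + 1.0)

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean mask of hypotheses rejected at FDR level alpha."""
    m = len(pvals)
    order = np.argsort(pvals)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = pvals[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if passed.any():
        # Reject everything up to the largest rank that passes its threshold.
        k = np.nonzero(passed)[0].max()
        rejected[order[: k + 1]] = True
    return rejected

def classify_heads(dla, alpha=0.05, n_perm=2000, seed=0):
    """Label each head 'writer', 'canceller', or 'null', keeping the sign."""
    observed, pvals = sign_flip_pvalues(dla, n_perm, seed)
    significant = benjamini_hochberg(pvals, alpha)
    labels = np.full(dla.shape[0], "null", dtype=object)
    labels[significant & (observed > 0)] = "writer"
    labels[significant & (observed < 0)] = "canceller"
    return labels
```

Ranking the same heads by `np.abs(dla.mean(axis=1))` alone would place strong writers and strong cancellers in one list, which is the failure mode of magnitude-only ranking that the abstract describes.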
Function-Vector Heads Are Two Populations: Writers and Cancellers in In-Context Learning
- Year
- 2026
- Hosting
- Full text hosted (CC-BY-4.0)
Attribution
- Abstract & full text
- arxiv.org/abs/2606.07560 (CC-BY-4.0)