0

Creative Collision: Directorial Persona Steering and Competition in Large Language Models

Activation steering has emerged as a powerful tool for shaping the behaviour of large language models at inference time, yet most prior work injects a \emph{single} semantic direction into the residual stream.

Preview
Year
2026
Hosting
Full text hostedCC-BY-4.0

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2606.16240CC-BY-4.0
TL;DR
Semantic Scholar
Attribution policy →

Abstract

Activation steering has emerged as a powerful tool for shaping the behaviour of large language models at inference time, yet most prior work injects a single semantic direction into the residual stream. We study the richer setting in which two semantically opposing steering vectors are superimposed -- a regime we call Creative Collision. Concretely, we construct directorial persona vectors for Steven Spielberg (optimistic, redemptive moral valence) and Martin Scorsese (dark, morally ambiguous) via mean-difference activation contrast on curated screenplay-derived corpora, then interpolate between them with a scalar mixing parameter α\in [0,1] and a steering coefficient λ. Across five evaluation axes -- moral valence, generation coherence, surface style, directional dominance, and vector geometry -- three principal findings emerge: (i) Spielberg's representational signature exhibits robust directional dominance, suppressing Scorsese's moral influence across almost the entire interpolation range; (ii) intermediate collision points paradoxically improve generation coherence relative to pure single-director steering at high λ; and (iii) both personas localise maximally to layer 28 of a 40-layer decoder-only transformer, revealing a shared moral-tone substrate. These results illuminate the geometry of competing semantic directions in transformer residual streams and have direct implications for controllable creative generation and value-aligned narrative synthesis.