Attribution
Attention Sinks Are Provably Necessary in Softmax Transformers: Evidence from Trigger-Conditional Tasks
arXiv 2026