Attribution
Masked Completion via Structured Diffusion with White-Box Transformers
arXiv 2024
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
from 2 papers
Jiantao Jiao
professor
Michael I. Jordan
Sam Buchanan
Song Mei
Tianyu Guo
Yaodong Yu
Yi Ma
Yu Bai
Ziyang Wu