Attribution
Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs
arXiv 2025
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models
arXiv 2024
from 2 papers
Akhil Kedia
Haejun Lee
Anshumann
Harshith Goka
Jinwoo Ahn
Joohyung Lee
Jungho Jung
Kangwook Lee
Sushil Khyalia
Taehwak Kwon