Attribution
Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing
arXiv 2025
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
arXiv 2024
from 2 papers
Albert Gu
Arjun Desai
Eric P. Xing
J. Zico Kolter
Kevin Y. Li
Nimit Sohoni
Tobias Katsch