Albert Gu
- Papers: 12
Authored papers
Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing
arXiv 2025
An Empirical Study of Mamba-based Language Models
arXiv 2024
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
arXiv 2024
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers
arXiv 2024
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
arXiv 2024
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
arXiv 2024
Structured State Space Models for In-Context Reinforcement Learning
Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN
arXiv 2023
How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections
arXiv 2022
Pretraining Without Attention
arXiv 2022
Efficiently Modeling Long Sequences with Structured State Spaces
HiPPO: Recurrent Memory with Optimal Polynomial Projections
NeurIPS 2020