Attribution
SparAMX: Accelerating Compressed LLMs Token Generation on AMX-powered CPUs
arXiv 2025
ShadowLLM: Predictor-based Contextual Sparsity for Large Language Models
arXiv 2024
from 2 papers
Ahmed F AbouElhamayed
Mohamed S. Abdelfattah
Yash Akhauri
Alexander M. Rush
Chi-Chih Chang
J. Pablo Muñoz
Nilesh Jain
Safeen Huda
Sameh Gobriel
Vui Seng Chua