Attribution
PromptDistill: Query-based Selective Token Retention in Intermediate Layers for Efficient Large Language Model Inference
arXiv 2025
Amir Zadeh
Chuan Li
Maojia Song
Soujanya Poria
Tej Deep Pala
Yew Ken Chia