Attribution
Context Memorization for Efficient Long Context Generation
arXiv 2026
FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization
arXiv 2025
Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference
arXiv 2024
from 3 papers
Hongxiang Fan
Wayne Luk
Daichi Fujiki
Guanxi Lu
Ka Fai Cedric Yiu
Konstantin Mishchenko
Masato Motomura
Rui Li
Shell Xu Hu
Stylianos I. Venieris