Attribution
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
arXiv 2024
Chao Wang
Hui Xiong
Kun Fu
Liyi Chen
Wei Wu
Zheng Wang
Zhuoshi Pan