Attribution
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling
arXiv 2025
MiniCPM4: Ultra-Efficient LLMs on End Devices
BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
arXiv 2024
from 3 papers
Maosong Sun
professor
Weilin Zhao
Xu Han
Zhiyuan Liu
Kaihuo Zhang
Weilun Zhao
Yudi Zhang
Yuxiang Huang
YuXuan Li
Bingxiang He