Yuxiang Huang
- Papers: 8
Authored papers
MiniCPM4: Ultra-Efficient LLMs on End Devices
arXiv 2025
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling
arXiv 2025
MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe
arXiv 2025
APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs
arXiv 2025
NOSA: Native and Offloadable Sparse Attention
arXiv 2025
Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster Speculative Decoding
arXiv 2024
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices
arXiv 2024
Tool Learning with Foundation Models
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 8 papers