Yucheng Li
Authored papers
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention
arXiv 2025
R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration
arXiv 2025
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
arXiv 2024
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
arXiv 2024
Evaluating Large Language Models for Generalization and Robustness via Data Compression
arXiv 2024
Compressing Context to Enhance Inference Efficiency of Large Language Models
arXiv 2023
LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction
arXiv 2023
Frequent co-authors
10 from 7 papers