Chengruidong Zhang
6 papers
Authored papers
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention
arXiv 2025
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference
arXiv 2025
Chain-of-Model Learning for Language Model
arXiv 2025
Region-Adaptive Sampling for Diffusion Transformers
arXiv 2025
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
arXiv 2024
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers