Shaochen Zhong
7 papers
Authored papers
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
arXiv 2025
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
arXiv 2025
More for Keys, Less for Values: Adaptive KV Cache Quantization
arXiv 2025
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
arXiv 2024
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches
arXiv 2024
Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model
Data-centric Artificial Intelligence: A Survey
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 7 papers