Qingqing Cao
4 papers
Authored papers
KV Prediction for Improved Time to First Token
arXiv 2024
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
arXiv 2024
AdANNS: A Framework for Adaptive Semantic Search
NeurIPS 2023 11
BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 4 papers