Ting Cao
- Papers: 9
Authored papers
OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism
arXiv 2026
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning
arXiv 2025
Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment
arXiv 2025
Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices
arXiv 2025
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
arXiv 2024
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge
arXiv 2024
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models
arXiv 2024
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference
arXiv 2023
AFPQ: Asymmetric Floating Point Quantization for LLMs
arXiv 2023
Affiliations
Frequent co-authors
10 from 9 papers