Tianle Cai

Papers: 14


Attribution

Affiliations & profile: Semantic Scholar


Authored papers

In-Place Test-Time Training

arXiv 2026

A Survey on Latent Reasoning

arXiv 2025

Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation

arXiv 2025

CommVQ: Commutative Vector Quantization for KV Cache Compression

arXiv 2025

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

arXiv 2024

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

arXiv 2024

SnapKV: LLM Knows What You are Looking for Before Generation

arXiv 2024

JetMoE: Reaching Llama2 Performance with 0.1M Dollars

arXiv 2024

Training-Free Activation Sparsity in Large Language Models

arXiv 2024

BitDelta: Your Fine-Tune May Only Be Worth One Bit

arXiv 2024

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

CVPR 2024 1

Large Language Models as Tool Makers

arXiv 2023

REST: Retrieval-Based Speculative Decoding

arXiv 2023

What Makes Convolutional Models Great on Long Sequence Modeling?

arXiv 2022

Affiliations

No known affiliations.

Frequent co-authors

from 14 papers

Song Han

Deming Chen

Jason D. Lee

Muyang Li

Yuhong Li

Di He

Ge Zhang (researcher): 2 shared papers

James Liu: 2 shared papers

Kai Li: 2 shared papers

Tri Dao (professor / Chief Scientist): 2 shared papers