Haotian Tang

Papers: 12

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

arXiv 2025

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

arXiv 2024

Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models

arXiv 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

arXiv 2024

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

arXiv 2024

VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

arXiv 2024

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

arXiv 2023

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

arXiv 2023

BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

arXiv 2022

TorchSparse: Efficient Point Cloud Inference Engine

arXiv 2022

End-to-End Entity Detection with Proposer and Regressor

arXiv 2022

Type-supervised sequence labeling based on the heterogeneous star graph for named entity recognition

arXiv 2022

Affiliations

No known affiliations.

Frequent co-authors

from 12 papers

Song Han

Shang Yang

Guangxuan Xiao

Yao Lu

Zhijian Liu

Enze Xie

Jiaming Tang

Junyu Chen

Yujun Lin

Changjiang Zhou