Zhijian Liu
- Papers: 18
Authored papers (18)
DFlash: Block Diffusion for Flash Speculative Decoding
arXiv 2026
LongCat-Flash-Thinking-2601 Technical Report
arXiv 2026
Scaling RL to Long Videos
arXiv 2025
LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
arXiv 2025
VLASH: Real-Time VLAs via Future-State-Aware Asynchronous Inference
arXiv 2025
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
arXiv 2025
Fast-dLLM v2: Efficient Block-Diffusion LLM
arXiv 2025
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
arXiv 2025
SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
arXiv 2025
NVILA: Efficient Frontier Visual Language Models
CVPR 2025
VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge
CVPR 2025
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
ICCV 2025
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
arXiv 2023
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
arXiv 2022
TorchSparse: Efficient Point Cloud Inference Engine
arXiv 2022
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Affiliations
Frequent co-authors
10 from 18 papers