Kunchang Li
- Papers: 20
Authored papers
Seed1.5-VL Technical Report
arXiv 2025
Make Your Training Flexible: Towards Deployment-Efficient Video Models
ICCV 2025
Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks
arXiv 2024
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
arXiv 2024
VideoMamba: State Space Model for Efficient Video Understanding
arXiv 2024
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding
arXiv 2024
Causal Diffusion Transformers for Generative Modeling
arXiv 2024
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
arXiv 2024
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
CVPR 2025
Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel
arXiv 2024
Vlogger: Make Your Dream A Vlog
CVPR 2024
TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration
arXiv 2024
MUSES: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration
arXiv 2024
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model
arXiv 2024
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
CVPR 2024
Unmasked Teacher: Towards Training-Efficient Video Foundation Models
ICCV 2023
You Only Need 90K Parameters to Adapt Light: A Light Weight Transformer for Image Enhancement and Exposure Correction
arXiv 2022
UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning
arXiv 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
arXiv 2022
Self-slimmed Vision Transformer
arXiv 2021
Frequent co-authors
10 from 20 papers