Xiangyu Zeng
Authored papers
RIVER: A Real-Time Interaction Benchmark for Video LLMs
arXiv 2026
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling
arXiv 2025
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning
arXiv 2025
Make Your Training Flexible: Towards Deployment-Efficient Video Models
ICCV 2025
VKnowU: Evaluating Visual Knowledge Understanding in Multimodal LLMs
arXiv 2025
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
arXiv 2024
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
arXiv 2024
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
CVPR 2025 1
A Framework for Inference Inspired by Human Memory Mechanisms
arXiv 2023
Frequent co-authors
10 from 9 papers