Xinhao Li
- Papers: 18
Authored papers
Learning to Discover at Test Time
arXiv 2026
LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization
arXiv 2026
Kimi K2.5: Visual Agentic Intelligence
arXiv 2026
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling
arXiv 2025
VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning
arXiv 2025
End-to-End Test-Time Training for Long Context
arXiv 2025
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?
arXiv 2025
TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
arXiv 2025
Pixels, Patterns, but No Poetry: To See The World like Humans
arXiv 2025
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
arXiv 2024
VideoMamba: State Space Model for Efficient Video Understanding
arXiv 2024
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
arXiv 2024
Learning to (Learn at Test Time): RNNs with Expressive Hidden States
arXiv 2024
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
CVPR 2025
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model
arXiv 2024
Modelling the 5G Energy Consumption using Real-world Data: Energy Fingerprint is All You Need
arXiv 2024
ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
arXiv 2023
Learning to (Learn at Test Time)
arXiv 2023
Frequent co-authors
10 from 18 papers