Yiwu Zhong
- Papers: 9
Authored papers
Rethinking Chain-of-Thought Reasoning for Videos
arXiv 2025
PAVE: Patching and Adapting Video Large Language Models
CVPR 2025 1
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning
ICCV 2025
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
arXiv 2024
GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation
arXiv 2023
Learning Concise and Descriptive Attributes for Visual Recognition
ICCV 2023 1
Towards Learning a Generalist Model for Embodied Navigation
CVPR 2024 1
Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations
CVPR 2023 1
RegionCLIP: Region-based Language-Image Pretraining
CVPR 2022 1
Frequent co-authors
10 from 9 papers