Bo He
6 papers
Authored papers
VideoLoom: A Video Large Language Model for Joint Spatial-Temporal Understanding
arXiv 2026
Pix2Cap-COCO: Advancing Visual Comprehension via Pixel-Level Captioning
arXiv 2025
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
CVPR 2024 1
OmniVid: A Generative Framework for Universal Video Understanding
CVPR 2024 1
Align and Attend: Multimodal Summarization with Dual Contrastive Losses
CVPR 2023 1
To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers