Boqiang Zhang
7 papers
Authored papers
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
arXiv 2026
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
arXiv 2025
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models
arXiv 2025
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
arXiv 2025
Focus on the Whole Character: Discriminative Character Modeling for Scene Text Recognition
arXiv 2024
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
CVPR 2025 1
Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents
arXiv 2024
Affiliations
No known affiliations.
Frequent co-authors
10 from 7 papers