Haozhan Shen
- Papers: 7
Authored papers (7)
MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning
arXiv 2026
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model
arXiv 2025
ImageRAG: Enhancing Ultra High Resolution Remote Sensing Imagery Analysis with ImageRAG
arXiv 2024
ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration
arXiv 2024
GUI Testing Arena: A Unified Benchmark for Advancing Autonomous GUI Testing Agent
arXiv 2024
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection
arXiv 2023
VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations
arXiv 2022
Frequent co-authors
10 from 7 papers