Attribution
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces
arXiv 2025
Open3DVQA: A Benchmark for Comprehensive Spatial Reasoning with Multimodal Large Language Model in Open Space
from 2 papers
Chen Gao
Jinqiang Cui
Xinlei Chen
Yong Li
Baining Zhao
Jianjie Fang
Jirong Zha
Xiao-Ping Zhang
Yue Wang
Zhiheng Zheng