Attribution
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model
arXiv 2025
OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer
arXiv 2024
from 2 papers
Tiancheng Zhao
Chunxin Fang
Haozhan Shen
Heting Ying
Jiajia Liao
Jingcheng Li
Kangjia Zhao
Kyusong Lee
Lu Zhang
Peng Liu