Attribution
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
arXiv 2025
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy
ICCV 2025
Diffusion Transformer Policy
arXiv 2024
from 3 papers
Jifeng Dai
Tianyi Zhang
Yu Qiao
Zhi Hou
Chengyang Zhao
Haonan Duan
Hengjun Pu
Xizhou Zhu
Yuntao Chen
Yuwen Xiong