Bo Zhao
- Papers: 20
Authored papers
Probing Visual Planning in Image Editing Models
arXiv 2026
HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer
arXiv 2025
RoboFAC: A Comprehensive Framework for Robotic Failure Analysis and Correction
arXiv 2025
Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation
arXiv 2025
U-ARM : Ultra low-cost general teleoperation interface for robot manipulation
arXiv 2025
QUASAR: Quantum Assembly Code Generation Using Tool-Augmented LLMs via Agentic RL
arXiv 2025
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding
arXiv 2025
MLVU: Benchmarking Multi-task Long Video Understanding
CVPR 2025
M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models
arXiv 2024
Efficient Multimodal Large Language Models: A Survey
arXiv 2024
Emu3: Next-Token Prediction is All You Need
arXiv 2024
SpatialBot: Precise Spatial Understanding with Vision Language Models
arXiv 2024
Efficient Multimodal Learning from Data-centric Perspective
arXiv 2024
Touchstone Benchmark: Are We on the Right Way for Evaluating AI Algorithms for Medical Segmentation?
arXiv 2024
Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly
CVPR 2025
SegVol: Universal and Interactive Volumetric Medical Image Segmentation
arXiv 2023
DYffusion: A Dynamics-informed Diffusion Model for Spatiotemporal Forecasting
SVIT: Scaling up Visual Instruction Tuning
arXiv 2023
Improving Convergence and Generalization Using Parameter Symmetries
arXiv 2023
AI Challenger : A Large-scale Dataset for Going Deeper in Image Understanding
arXiv 2017