Zheng Zhu
- Papers: 23
Authored papers (23)
GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning
arXiv 2026
ViVa: A Video-Generative Value Model for Robot Reinforcement Learning
arXiv 2026
GigaWorld-Policy: An Efficient Action-Centered World-Action Model
arXiv 2026
ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video
arXiv 2026
DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning
arXiv 2026
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs
arXiv 2026
VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction
arXiv 2025
GigaWorld-0: World Models as Data Engine to Empower Embodied AI
arXiv 2025
SwiftVLA: Unlocking Spatiotemporal Dynamics for Lightweight VLA Models at Minimal Overhead
arXiv 2025
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
arXiv 2025
VLA-R1: Enhancing Reasoning in Vision-Language-Action Models
arXiv 2025
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
arXiv 2024
OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models
arXiv 2024
OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception
ICCV 2023 1
SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving
ICCV 2023 1
OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction
ICCV 2023 1
DiffusionDepth: Diffusion Denoising Approach for Monocular Depth Estimation
arXiv 2023
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving
arXiv 2023
DREAM: Efficient Dataset Distillation by Representative Matching
ICCV 2023 1
BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving
arXiv 2022
OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions
ICCV 2023 1
Token-Label Alignment for Vision Transformers
ICCV 2023 1
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
CVPR 2022 1
Frequent co-authors
10 from 23 papers