Zheng Zhu

Papers
23

Authored papers

23

GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning

arXiv 2026

2026

ViVa: A Video-Generative Value Model for Robot Reinforcement Learning

arXiv 2026

2026

GigaWorld-Policy: An Efficient Action-Centered World-Action Model

arXiv 2026

2026

ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video

arXiv 2026

2026

DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning

arXiv 2026

2026

π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs

arXiv 2026

2026

VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction

arXiv 2025

2025

GigaWorld-0: World Models as Data Engine to Empower Embodied AI

arXiv 2025

2025

SwiftVLA: Unlocking Spatiotemporal Dynamics for Lightweight VLA Models at Minimal Overhead

arXiv 2025

2025

Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image

arXiv 2025

2025

VLA-R1: Enhancing Reasoning in Vision-Language-Action Models

arXiv 2025

2025

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond

arXiv 2024

2024

OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models

arXiv 2024

2024

OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception

ICCV 2023 1

2023

SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving

ICCV 2023 1

2023

OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction

ICCV 2023 1

2023

DiffusionDepth: Diffusion Denoising Approach for Monocular Depth Estimation

arXiv 2023

2023

On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving

arXiv 2023

2023

DREAM: Efficient Dataset Distillation by Representative Matching

ICCV 2023 1

2023

BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving

arXiv 2022

2022

OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions

ICCV 2023 1

2022

Token-Label Alignment for Vision Transformers

ICCV 2023 1

2022

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

CVPR 2022 1

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 23 papers