Bin Zhao
- Papers: 19
Authored papers (19)
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model
arXiv 2025
Hume: Introducing System-2 Thinking in Visual-Language-Action Model
arXiv 2025
AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use
arXiv 2025
Exploring the Potential of Encoder-free Architectures in 3D LMMs
arXiv 2025
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation
arXiv 2025
Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation
arXiv 2025
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control
arXiv 2025
Learning Manipulation by Predicting Interaction
arXiv 2024
Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training
arXiv 2024
AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots
arXiv 2024
Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding
arXiv 2024
LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control
arXiv 2024
Kinematic-aware Prompting for Generalizable Articulated Object Manipulation with LLMs
arXiv 2023
Behavior Contrastive Learning for Unsupervised Skill Discovery
arXiv 2023
Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior Refinement
ICCV 2023
Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
arXiv 2023
ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance
arXiv 2023
Towards Nonlinear-Motion-Aware and Occlusion-Robust Rolling Shutter Correction
ICCV 2023