
Hao Dong

Papers
21


Authored papers

21

Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study

arXiv 2026

2026

DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy

arXiv 2025

2025

To Trust Or Not To Trust Your Vision-Language Model's Prediction

arXiv 2025

2025

Extremely Simple Multimodal Outlier Synthesis for Out-of-Distribution Detection and Segmentation

arXiv 2025

2025

Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface

arXiv 2025

2025

TwinAligner: Visual-Dynamic Alignment Empowers Physics-aware Real2Sim2Real for Robotic Manipulation

arXiv 2025

2025

Adapting Vision-Language Models Without Labels: A Comprehensive Survey

arXiv 2025

2025

SpatialBot: Precise Spatial Understanding with Vision Language Models

arXiv 2024

2024

Learning Manipulation by Predicting Interaction

arXiv 2024

2024

A3VLM: Actionable Articulation-Aware Vision Language Model

arXiv 2024

2024

ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models

arXiv 2024

2024

MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities

arXiv 2024

2024

A Survey of Reasoning with Foundation Models

arXiv 2023

2023

Personalize Segment Anything Model with One Shot

arXiv 2023

2023

Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model

arXiv 2023

2023

SimMMDG: A Simple and Effective Framework for Multi-modal Domain Generalization

2023

Leveraging SE(3) Equivariance for Learning 3D Geometric Shape Assembly

ICCV 2023

2023

GFPose: Learning 3D Human Pose Prior with Gradient Fields

CVPR 2023

2022

Probabilistic Mixture-of-Experts for Efficient Deep Reinforcement Learning

2021

DMotion: Robotic Visuomotor Control with Unsupervised Forward Model Learned from Videos

arXiv 2021

2021

DLGAN: Disentangling Label-Specific Fine-Grained Features for Image Manipulation

arXiv 2019

2019

Affiliations

No known affiliations.

Frequent co-authors

10

from 21 papers