Hanwang Zhang
- Papers: 24
Authored papers
MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons
arXiv 2026
Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization
ICCV 2025
VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models
arXiv 2025
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation
arXiv 2025
On Path to Multimodal Generalist: General-Level and General-Bench
arXiv 2025
DEPO: Dual-Efficiency Preference Optimization for LLM Agents
arXiv 2025
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts
arXiv 2024
HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editing
arXiv 2024
Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction
arXiv 2024
Exploring Diffusion Time-steps for Unsupervised Representation Learning
arXiv 2024
Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior
CVPR 2024 1
Towards Semantic Equivalence of Tokenization in Multimodal LLM
arXiv 2024
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models
arXiv 2024
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
arXiv 2024
MVGamba: Unify 3D Content Generation as State Space Sequence Modeling
arXiv 2024
Fast Diffusion Model
arXiv 2023
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
arXiv 2023
DisCo: Disentangled Control for Realistic Human Dance Generation
CVPR 2024 1
Equivariant Similarity for Vision-Language Foundation Models
ICCV 2023 1
Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection
arXiv 2023
Mitigating and Evaluating Static Bias of Action Representations in the Background and the Foreground
ICCV 2023 1
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
arXiv 2022
Prompt-aligned Gradient for Prompt Tuning
ICCV 2023 1
KQA Pro: A Dataset with Explicit Compositional Programs for Complex Question Answering over Knowledge Base
ACL 2022 5
Frequent co-authors
10 from 24 papers