Zehan Wang
- Papers: 16
Authored papers (16)
Orient Anything V2: Unifying Orientation and Rotation Understanding
arXiv 2026
Depth Anything with Any Prior
arXiv 2025
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling
arXiv 2025
DSI-Bench: A Benchmark for Dynamic Spatial Intelligence
arXiv 2025
APO: Enhancing Reasoning Ability of MLLMs via Asymmetric Policy Optimization
arXiv 2025
WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing
arXiv 2025
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
arXiv 2024
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
arXiv 2024
WavChat: A Survey of Spoken Dialogue Models
arXiv 2024
Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching
arXiv 2024
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion
arXiv 2024
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models
arXiv 2024
Improving Long-Text Alignment for Text-to-Image Diffusion Models
arXiv 2024
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition
ICCV 2023
Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding
ICCV 2023
Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers
arXiv 2023
Frequent co-authors
10 from 16 papers