Hao Tan
- Papers: 25
Authored papers (25)
OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation
arXiv 2026
tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction
arXiv 2026
Gaussian Mixture Flow Matching Models
arXiv 2025
pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation
arXiv 2025
E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training
arXiv 2025
HunyuanVideo 1.5 Technical Report
arXiv 2025
GUI-AIMA: Aligning Intrinsic Multimodal Attention with a Context Anchor for GUI Grounding
arXiv 2025
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time
arXiv 2025
Rethinking Training Dynamics in Scale-wise Autoregressive Generation
arXiv 2025
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
arXiv 2024
Turbo3D: Ultra-fast Text-to-3D Generation
CVPR 2025 1
LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers
arXiv 2024
LRM-Zero: Training Large Reconstruction Models with Synthesized Data
arXiv 2024
Progressive Autoregressive Video Diffusion Models
arXiv 2024
Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models
arXiv 2024
HunyuanVideo: A Systematic Framework For Large Video Generative Models
arXiv 2024
Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats
arXiv 2024
Learning Navigational Visual Representations with Semantic Map Supervision
ICCV 2023 1
DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents
arXiv 2023
Scaling Data Generation in Vision-and-Language Navigation
ICCV 2023 1
VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer
NeurIPS 2021 12
How Much Can CLIP Benefit Vision-and-Language Tasks?
arXiv 2021
Unifying Vision-and-Language Tasks via Text Generation
arXiv 2021
Expressing Visual Relationships via Language
LXMERT: Learning Cross-Modality Encoder Representations from Transformers
Affiliations
Frequent co-authors
10 from 25 papers