Dan Xu
Authored papers (23)
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence
arXiv 2026
VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection
arXiv 2026
WildActor: Unconstrained Identity-Preserving Video Generation
arXiv 2026
CARE-Edit: Condition-Aware Routing of Experts for Contextual Image Editing
arXiv 2026
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
ICCV 2025
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
arXiv 2025
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models
arXiv 2025
Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields
arXiv 2025
FlashVGGT: Efficient and Scalable Visual Geometry Transformers with Compressed Descriptor Attention
arXiv 2025
Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction
arXiv 2025
Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos
arXiv 2025
Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning
arXiv 2025
Taming LLMs by Scaling Learning Rates with Gradient Grouping
arXiv 2025
HyRF: Hybrid Radiance Fields for Memory-efficient and High-quality Novel View Synthesis
arXiv 2025
One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control
arXiv 2025
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation
arXiv 2025
FullPart: Generating each 3D Part at Full Resolution
arXiv 2025
From One to More: Contextual Part Latents for 3D Generation
ICCV 2025
Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection
arXiv 2024
3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection
arXiv 2024
DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
arXiv 2023
TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts
ICCV 2023
Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation
ICCV 2023
Affiliations
Frequent co-authors
10 from 23 papers