Bo Dai
Authored papers (40)
M^3: Dense Matching Meets Multi-View Foundation Models for Monocular Gaussian Splatting SLAM
arXiv 2026
EdgeTAM: On-Device Track Anything Model
CVPR 2025
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
arXiv 2025
GAS: Generative Avatar Synthesis from a Single Image
ICCV 2025
MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
arXiv 2025
ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation
arXiv 2025
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer
arXiv 2025
ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting
ICCV 2025
InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts
arXiv 2025
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization
CVPR 2025
Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation
arXiv 2025
Animate Any Character in Any World
arXiv 2025
Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians
arXiv 2024
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
arXiv 2024
Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text
arXiv 2024
GenAD: Generalized Predictive Model for Autonomous Driving
CVPR 2024
MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
arXiv 2024
GaussianAnything: Interactive Point Cloud Flow Matching For 3D Object Generation
arXiv 2024
HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation
arXiv 2024
VideoAgent: Self-Improving Video Generation
arXiv 2024
On Domain-Specific Post-Training for Multimodal Large Language Models
arXiv 2024
Matryoshka: Learning to Drive Black-Box LLMs with LLMs
arXiv 2024
DRiVE: Diffusion-based Rigging Empowers Generation of Versatile and Expressive Characters
CVPR 2025
LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation
arXiv 2024
EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM
arXiv 2023
Unified Human-Scene Interaction via Prompted Chain-of-Contacts
arXiv 2023
DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering
ICCV 2023
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
arXiv 2023
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models
arXiv 2023
AdaPlanner: Adaptive Planning from Feedback with Language Models
InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint
arXiv 2023
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
arXiv 2023
Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering
CVPR 2024
Prototype-based Embedding Network for Scene Graph Generation
CVPR 2023
X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events
ICCV 2023
EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion
CVPR 2024
3DHumanGAN: 3D-Aware Human Image Generation with 3D Pose Mapping
ICCV 2023
TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing
CVPR 2022
Video Representation Learning with Visual Tempo Consistency
arXiv 2020
Novel Policy Seeking with Constrained Optimization