Yujun Shen
Authored papers: 37
Geometric Context Transformer for Streaming 3D Reconstruction
arXiv 2026
Masked Depth Modeling for Spatial Perception
arXiv 2026
A Pragmatic VLA Foundation Model
arXiv 2026
Causal World Modeling for Robot Control
arXiv 2026
Advancing Open-source World Models
arXiv 2026
AvatarPointillist: AutoRegressive 4D Gaussian Avatarization
arXiv 2026
Interacted Planes Reveal 3D Line Mapping
arXiv 2026
ScaleLSD: Scalable Deep Line Segment Detection Streamlined
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
arXiv 2025
The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text
arXiv 2025
MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues
arXiv 2025
SpatialTrackerV2: 3D Point Tracking Made Easy
AvatarArtist: Open-Domain 4D Avatarization
CVPR 2025
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
arXiv 2025
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
ICCV 2025
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives
arXiv 2025
Calligrapher: Freestyle Text Image Customization
arXiv 2025
MagicQuill: An Intelligent Interactive Image Editing System
CVPR 2025
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
arXiv 2024
SpatialTracker: Tracking Any 2D Pixels in 3D Space
CVPR 2024
AniDoc: Animation Creation Made Easier
CVPR 2025
Edicho: Consistent Image Editing in the Wild
ICCV 2025
Learning Temporally Consistent Video Depth from Video Diffusion Priors
CVPR 2025
DreamLIP: Language-Image Pre-training with Long Captions
arXiv 2024
MaPa: Text-driven Photorealistic Material Painting for 3D Shapes
arXiv 2024
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis
CVPR 2025
Composer: Creative and Controllable Image Synthesis with Composable Conditions
arXiv 2023
NEAT: Distilling 3D Wireframes from Neural Attraction Fields
CVPR 2024
TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification
arXiv 2023
Learning Naturally Aggregated Appearance for Efficient 3D Editing
arXiv 2023
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
CVPR 2024
Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation
arXiv 2023
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos
ICCV 2023
Balancing Logit Variation for Long-tailed Semantic Segmentation
Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels
CVPR 2022
Learning from Future: A Novel Self-Training Framework for Semantic Segmentation
arXiv 2022
Image Processing Using Multi-Code GAN Prior
Frequent co-authors
10 from 37 papers