Zongxin Yang
- Papers: 22
Authored papers
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details
arXiv 2026
Stroke3D: Lifting 2D strokes into rigged 3D model via latent diffusion models
arXiv 2026
MedSAM2: Segment Anything in 3D Medical Images and Videos
arXiv 2025
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge
arXiv 2025
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
ICCV 2025
The Devil is in Temporal Token: High Quality Video Reasoning Segmentation
CVPR 2025
BideDPO: Conditional Image Generation with Simultaneous Text and Condition Alignment
arXiv 2025
3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering
arXiv 2025
MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis
arXiv 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
arXiv 2024
HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting
arXiv 2024
3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation
arXiv 2024
Replication in Visual Diffusion Models: A Survey and Outlook
arXiv 2024
Segment and Track Anything
arXiv 2023
Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation
ICCV 2023
SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction
CVPR 2024
Human101: Training 100+FPS Human Gaussians in 100s from 1 View
arXiv 2023
CATR: Combinatorial-Dependence Audio-Queried Transformer for Audio-Visual Video Segmentation
arXiv 2023
JOTR: 3D Joint Contrastive Learning with Transformers for Occluded Human Mesh Recovery
ICCV 2023
Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation
ICCV 2023
Video Object Segmentation in Panoptic Wild Scenes
arXiv 2023
V$^2$L: Leveraging Vision and Vision-language Models into Large-scale Product Retrieval
arXiv 2022
Frequent co-authors
10 from 22 papers