Yiren Song
- Papers: 24
Authored papers (24)
Soap2Soap: Long Cinematic Video Remaking via Multi-Agent Collaboration
arXiv 2026
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models
arXiv 2026
OmniHumanoid: Streaming Cross-Embodiment Video Generation with Paired-Free Adaptation
arXiv 2026
EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer
ICCV 2025
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data
arXiv 2025
Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks
ICCV 2025
GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains
arXiv 2025
MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation
arXiv 2025
X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale
arXiv 2025
The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment
arXiv 2025
OmniRefiner: Reinforcement-Guided Local Diffusion Refinement
arXiv 2025
H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos
arXiv 2025
IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning
arXiv 2025
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer
ICCV 2025
DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers
arXiv 2025
MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks
arXiv 2025
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering
arXiv 2025
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data
arXiv 2025
FocusedAD: Character-centric Movie Audio Description
arXiv 2025
FonTS: Text Rendering with Typography and Style Controls
ICCV 2025
Image Watermarks are Removable Using Controllable Regeneration from Clean Noise
arXiv 2024
Stable-Hair: Real-World Hair Transfer via Diffusion Model
arXiv 2024
Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model
arXiv 2024
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation
CVPR 2024
Frequent co-authors
10 from 24 papers