Xintao Wang

Papers
61

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

arXiv 2026

2026

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling

arXiv 2026

2026

Flow-GRPO: Training Flow Matching Models via Online RL

arXiv 2025

2025

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

ICCV 2025

2025

GameFactory: Creating New Games with Generative Interactive Videos

ICCV 2025

2025

BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation

arXiv 2025

2025

Scaling Image and Video Generation via Test-Time Evolutionary Search

arXiv 2025

2025

MultiShotMaster: A Controllable Multi-Shot Video Generation Framework

arXiv 2025

2025

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

arXiv 2025

2025

GARDO: Reinforcing Diffusion Models without Reward Hacking

arXiv 2025

2025

Simulating the Visual World with Artificial Intelligence: A Roadmap

arXiv 2025

2025

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

arXiv 2025

2025

Latent Diffusion Model without Variational Autoencoder

arXiv 2025

2025

OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes

arXiv 2025

2025

VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning

arXiv 2025

2025

ARIA: Training Language Agents with Intention-Driven Reward Aggregation

arXiv 2025

2025

VR-Thinker: Boosting Video Reward Models through Thinking-with-Image Reasoning

arXiv 2025

2025

DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

arXiv 2025

2025

CoSER: Coordinating LLM-Based Persona Simulation of Established Roles

arXiv 2025

2025

SketchVideo: Sketch-based Video Generation and Editing

CVPR 2025

2025

Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

arXiv 2025

2025

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

CVPR 2024

2024

StyleMaster: Stylize Your Video with Artistic Generation and Translation

CVPR 2025

2024

DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing

CVPR 2024

2024

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

arXiv 2024

2024

SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

arXiv 2024

2024

InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models

arXiv 2024

2024

BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion

arXiv 2024

2024

MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model

arXiv 2024

2024

VideoTetris: Towards Compositional Text-to-Video Generation

arXiv 2024

2024

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation

arXiv 2024

2024

CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities

arXiv 2024

2024

InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews

arXiv 2023

2023

DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors

arXiv 2023

2023

DreamDiffusion: Generating High-Quality Images from Brain EEG Signals

arXiv 2023

2023

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

ICCV 2023

2023

StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter

arXiv 2023

2023

Making LLaMA SEE and Draw with SEED Tokenizer

arXiv 2023

2023

FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling

arXiv 2023

2023

ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models

arXiv 2023

2023

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

CVPR 2024

2023

T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

arXiv 2023

2023

HAT: Hybrid Attention Transformer for Image Restoration

arXiv 2023

2023

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation

arXiv 2023

2023

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos

arXiv 2023

2023

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

ICCV 2023

2023

AnimateZero: Video Diffusion Models are Zero-Shot Image Animators

arXiv 2023

2023

CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models

arXiv 2023

2023

TaleCrafter: Interactive Story Visualization with Multiple Characters

arXiv 2023

2023

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation

arXiv 2023

2023

Inserting Anybody in Diffusion Models via Celeb Basis

2023

EvalCrafter: Benchmarking and Evaluating Large Video Generation Models

CVPR 2024

2023

Can Large Language Models Understand Real-World Complex Instructions?

arXiv 2023

2023

Activating More Pixels in Image Super-Resolution Transformer

CVPR 2023

2022

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

ICCV 2023

2022

AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos

arXiv 2022

2022

NTIRE 2022 Challenge on Super-Resolution and Quality Enhancement of Compressed Video: Dataset, Methods and Results

arXiv 2022

2022

Towards Real-World Blind Face Restoration with Generative Facial Prior

CVPR 2021

2021

Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data

arXiv 2021

2021

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks

arXiv 2018

2018

Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform

2018

Affiliations

No known affiliations.

Frequent co-authors

10

from 61 papers