Qifeng Chen
- Papers: 47
Authored papers (47)
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
arXiv 2026
Manifold-Aware Exploration for Reinforcement Learning in Video Generation
arXiv 2026
Does Synthetic Layered Design Data Benefit Layered Design Decomposition?
arXiv 2026
AvatarPointillist: AutoRegressive 4D Gaussian Avatarization
arXiv 2026
Show, Don't Tell: Morphing Latent Reasoning into Image Generation
arXiv 2026
Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing
arXiv 2026
MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues
arXiv 2025
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data
arXiv 2025
Active Intelligence in Video Avatars via Closed-loop World Modeling
arXiv 2025
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
arXiv 2025
Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding
arXiv 2025
LongVideoAgent: Multi-Agent Reasoning with Long Videos
arXiv 2025
The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text
arXiv 2025
Calligrapher: Freestyle Text Image Customization
arXiv 2025
Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control
arXiv 2025
AvatarArtist: Open-Domain 4D Avatarization
CVPR 2025 1
LPO: Towards Accurate GUI Agent Interaction via Location Preference Optimization
arXiv 2025
SkillMimic: Learning Basketball Interaction Skills from Demonstrations
CVPR 2025 1
MagicQuill: An Intelligent Interactive Image Editing System
CVPR 2025 1
Large Motion Video Autoencoding with Cross-modal Video VAE
ICCV 2025
Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation
arXiv 2024
VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation
arXiv 2024
Edicho: Consistent Image Editing in the Wild
ICCV 2025
DiT4Edit: Diffusion Transformer for Image Editing
arXiv 2024
Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts
arXiv 2024
Hawk: Learning to Understand Open-World Video Anomalies
arXiv 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
CVPR 2025 1
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
arXiv 2024
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
arXiv 2024
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis
CVPR 2025 1
FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
ICCV 2023 1
Blind Video Deflickering by Neural Filtering with a Flawed Atlas
CVPR 2023 1
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos
arXiv 2023
MagicStick: Controllable Video Editing via Control Handle Transformations
arXiv 2023
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
arXiv 2023
ControlLLM: Augment Language Models with Tools by Searching on Graphs
arXiv 2023
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models
arXiv 2023
Learning Naturally Aggregated Appearance for Efficient 3D Editing
arXiv 2023
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
CVPR 2024 1
Pretraining is All You Need for Image-to-Image Translation
arXiv 2022
Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset
LREC 2022 6
Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
ICCV 2023 1
Latent Video Diffusion Models for High-Fidelity Long Video Generation
arXiv 2022
Involution: Inverting the Inherence of Convolution for Visual Recognition
CVPR 2021 1
Image Inpainting with External-internal Learning and Monochromic Bottleneck
CVPR 2021 1
ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation
LREC 2022 6
Internal Video Inpainting by Implicit Long-range Propagation
ICCV 2021 10
Frequent co-authors
10 from 47 papers