Zhongang Qi
- Papers: 12
Authored papers (12)
VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning
arXiv 2025
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
arXiv 2025
DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation
arXiv 2025
DOGE: Towards Versatile Visual Document Grounding and Referring
ICCV 2025
Taming Rectified Flow for Inversion and Editing
arXiv 2024
PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM
arXiv 2024
E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding
arXiv 2024
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities
arXiv 2024
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
CVPR 2024
T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
arXiv 2023
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
ICCV 2023
CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models
arXiv 2023
Frequent co-authors
10 from 12 papers