Qingyu Shi
- Papers: 11
Authored papers (11)
Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model
arXiv 2025
Towards Customized Multimodal Role-Play
arXiv 2026
Prism: Efficient Test-Time Scaling via Hierarchical Search and Self-Verification for Discrete Diffusion Language Models
arXiv 2026
RecTok: Reconstruction Distillation along Rectified Flow
arXiv 2025
An Empirical Study of GPT-4o Image Generation Capabilities
arXiv 2025
On Path to Multimodal Generalist: General-Level and General-Bench
arXiv 2025
Personalized Safety Alignment for Text-to-Image Diffusion Models
arXiv 2025
Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation
arXiv 2025
Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer
ICCV 2025
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
arXiv 2024
RelationBooth: Towards Relation-Aware Customized Object Generation
arXiv 2024
Frequent co-authors
10 from 11 papers