Yatai Ji
6 papers
Authored papers
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation
arXiv 2026
From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model
arXiv 2025
Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM
arXiv 2024
IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model
arXiv 2024
Control-A-Video: Controllable Text-to-Video Diffusion Models with Motion Prior and Reward Feedback Learning
arXiv 2023
MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model
CVPR 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers