Attribution
Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding
arXiv 2025
XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation
from 2 papers
Bowen Chen
Jiawei Wang
Kang Du
Li Chen
Liping Yuan
Mengyi Zhao
Xinglong Wu
Xu Wang
Yuan Lin
Yuchen Zhang