Yunlong Tang
- Papers: 10
Authored papers (10)
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models
arXiv 2025
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
arXiv 2025
FreSca: Unveiling the Scaling Space in Diffusion Models
arXiv 2025
Generative AI for Cel-Animation: A Survey
arXiv 2025
MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness
arXiv 2025
AIM 2024 Challenge on Video Saliency Prediction: Methods and Results
arXiv 2024
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?
CVPR 2025
GaussianStyle: Gaussian Head Avatar via StyleGAN
arXiv 2024
Video Understanding with Large Language Models: A Survey
arXiv 2023
Caption Anything: Interactive Image Description with Diverse Multimodal Controls
arXiv 2023