Chenliang Xu
Authored papers (20)
SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks
arXiv 2026
GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling
ICCV 2025
Learning to Highlight Audio by Watching Movies
CVPR 2025
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models
arXiv 2025
Video-R4: Reinforcing Text-Rich Video Reasoning with Visual Rumination
arXiv 2025
Directional Reasoning Injection for Fine-Tuning MLLMs
arXiv 2025
CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs
arXiv 2025
MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness
arXiv 2025
Generative AI for Cel-Animation: A Survey
arXiv 2025
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
arXiv 2025
FreSca: Unveiling the Scaling Space in Diffusion Models
arXiv 2025
Adaptive Super Resolution For One-Shot Talking-Head Generation
arXiv 2024
Tri$^{2}$-plane: Thinking Head Avatar via Feature Pyramid
arXiv 2024
Treat Visual Tokens as Text? But Your MLLM Only Needs Fewer Efforts to See
arXiv 2024
GaussianStyle: Gaussian Head Avatar via StyleGAN
arXiv 2024
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?
CVPR 2025
Video Understanding with Large Language Models: A Survey
arXiv 2023
Egocentric Audio-Visual Object Localization
CVPR 2023
A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others
CVPR 2023
Learning by Planning: Language-Guided Global Image Editing
CVPR 2021
Frequent co-authors
10 from 20 papers