Chenliang Xu

Authored papers

20

SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks

arXiv 2026

2026

GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling

ICCV 2025

2025

Learning to Highlight Audio by Watching Movies

CVPR 2025 1

2025

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

arXiv 2025

2025

Video-R4: Reinforcing Text-Rich Video Reasoning with Visual Rumination

arXiv 2025

2025

Directional Reasoning Injection for Fine-Tuning MLLMs

arXiv 2025

2025

CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs

arXiv 2025

2025

MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness

arXiv 2025

2025

Generative AI for Cel-Animation: A Survey

arXiv 2025

2025

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting

arXiv 2025

2025

FreSca: Unveiling the Scaling Space in Diffusion Models

arXiv 2025

2025

Adaptive Super Resolution For One-Shot Talking-Head Generation

arXiv 2024

2024

Tri²-plane: Thinking Head Avatar via Feature Pyramid

arXiv 2024

2024

Treat Visual Tokens as Text? But Your MLLM Only Needs Fewer Efforts to See

arXiv 2024

2024

GaussianStyle: Gaussian Head Avatar via StyleGAN

arXiv 2024

2024

VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?

CVPR 2025 1

2024

Video Understanding with Large Language Models: A Survey

arXiv 2023

2023

Egocentric Audio-Visual Object Localization

CVPR 2023 1

2023

A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others

CVPR 2023 1

2022

Learning by Planning: Language-Guided Global Image Editing

CVPR 2021 1

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 20 papers