Shiguang Shan
Authored papers (18)
EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models
ICCV 2025
Jodi: Unification of Visual Generation and Understanding via Joint Modeling
arXiv 2025
HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding
arXiv 2025
Uniform Discrete Diffusion with Metric Path for Video Generation
arXiv 2025
un$^2$CLIP: Improving CLIP's Visual Detail Capturing Ability via Inverting unCLIP
arXiv 2025
Assimilation Matters: Model-level Backdoor Detection in Vision-Language Pretrained Models
arXiv 2025
DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks
arXiv 2025
Autoregressive Video Generation without Vector Quantization
arXiv 2024
UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing
CVPR 2025
GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing
arXiv 2024
CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation
arXiv 2024
HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention
CVPR 2024
M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation
arXiv 2024
T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models
arXiv 2024
Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
arXiv 2024
Tokenize Anything via Prompting
arXiv 2023
Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework
arXiv 2022
Synchronous Bidirectional Learning for Multilingual Lip Reading
arXiv 2020
Frequent co-authors
10 from 18 papers