Xiaochun Cao
- Papers: 22
Authored papers (22)
RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS
ICCV 2025
Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging
arXiv 2025
DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing
ICCV 2025
Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization
arXiv 2025
OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation
arXiv 2025
UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Network
arXiv 2025
Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent
arXiv 2025
UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction
arXiv 2025
One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework
arXiv 2025
Less is More: Efficient Black-box Attribution via Minimal Interpretable Subset Selection
arXiv 2025
Where MLLMs Attend and What They Rely On: Explaining Autoregressive Token Generation
arXiv 2025
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities
arXiv 2024
Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors
arXiv 2024
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
arXiv 2024
Interpreting Object-level Foundation Models via Visual Precision Search
CVPR 2025
Less is More: Fewer Interpretable Region via Submodular Subset Selection
arXiv 2024
Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection
arXiv 2024
Object Detectors in the Open Environment: Challenges, Solutions, and Outlook
arXiv 2024
Restoring Images in Adverse Weather Conditions via Histogram Transformer
arXiv 2024
Towards Real-World Blind Face Restoration with Generative Diffusion Prior
arXiv 2023
Towards Vision Enhancing LLMs: Empowering Multimodal Knowledge Storage and Sharing in LLMs
arXiv 2023
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
CVPR 2024