Qidong Huang
- Papers: 9
Authored papers (9)
Qwen3-VL Technical Report
arXiv 2025
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion
arXiv 2025
MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation
arXiv 2025
ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing
arXiv 2025
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
arXiv 2025
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
arXiv 2024
OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
CVPR 2024
Diversity-Aware Meta Visual Prompting
CVPR 2023
Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting
ICCV 2023
Frequent co-authors
10 from 9 papers