Qi Qian
- Papers: 16
Authored papers (16)
Reward Hacking in the Era of Large Models: Mechanisms, Emergent Misalignment, Challenges
arXiv 2026
VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
arXiv 2026
T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning
arXiv 2026
Small Vision-Language Models are Smart Compressors for Long Video Understanding
arXiv 2026
MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs
arXiv 2025
Searching for Best Practices in Retrieval-Augmented Generation
arXiv 2024
SeA: Semantic Adversarial Augmentation for Last Layer Features from Unsupervised Representation Learning
arXiv 2024
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
arXiv 2024
Efficient Personalized Text-to-image Generation by Leveraging Textual Subspace
arXiv 2024
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video
arXiv 2023
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
CVPR 2024 1
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks
arXiv 2023
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model
arXiv 2023
Improved Visual Fine-tuning with Natural Language Supervision
ICCV 2023 1
RBGNet: Ray-based Grouping for 3D Object Detection
CVPR 2022 1
Neural Architecture Design for GPU-Efficient Networks
arXiv 2020
Affiliations
Frequent co-authors
10 from 16 papers