Qinghao Ye
- Papers: 13
Authored papers (13)
Seed1.5-VL Technical Report
arXiv 2025
Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning
arXiv 2025
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
arXiv 2025
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
arXiv 2025
Artificial Hippocampus Networks for Efficient Long-Context Modeling
arXiv 2025
Classification Done Right for Vision-Language Pre-Training
arXiv 2024
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
arXiv 2024
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video
arXiv 2023
TiMix: Text-aware Image Mixing for Effective Vision-Language Pre-training
arXiv 2023
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
CVPR 2024
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks
arXiv 2023
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model
arXiv 2023
Evaluation and Analysis of Hallucination in Large Vision-Language Models
arXiv 2023
Frequent co-authors
10 from 13 papers