Xiao Xu
- Papers: 10
Authored papers (10)
Qwen-Image-VAE-2.0 Technical Report
arXiv 2026
Qwen-Image Technical Report
arXiv 2025
Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition
arXiv 2025
M$^3$CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought
arXiv 2024
Self-Constructed Context Decompilation with Fined-grained Alignment Enhancement
arXiv 2024
Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models
arXiv 2024
V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization
arXiv 2024
ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
arXiv 2023
BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning
arXiv 2022
Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding
arXiv 2021