Chao Xu
- Papers: 21
Authored papers (21)
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
arXiv 2025
An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges
arXiv 2025
Beyond Description: Cognitively Benchmarking Fine-Grained Action for Embodied Agents
arXiv 2025
Think Before You Move: Latent Motion Reasoning for Text-to-Motion Generation
arXiv 2025
U-REPA: Aligning Diffusion U-Nets to ViTs
arXiv 2025
PSC: Extending Context Window of Large Language Models via Phase Shift Calibration
arXiv 2025
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations
CVPR 2025 1
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
arXiv 2024
U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers
arXiv 2024
MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos
arXiv 2024
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
arXiv 2024
DiC: Rethinking Conv3x3 Designs in Diffusion Models
CVPR 2025 1
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models
arXiv 2024
TryOn-Adapter: Efficient Fine-Grained Clothing Identity Adaptation for High-Fidelity Virtual Try-On
arXiv 2024
GPT4Image: Can Large Pre-trained Models Help Vision Models on Perception Tasks?
arXiv 2023
One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization
WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models
arXiv 2023
Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model
arXiv 2023
AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time
arXiv 2022
GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts
arXiv 2022
Augmented Shortcuts for Vision Transformers
NeurIPS 2021 12