Ming Yang
- Papers: 12
Authored papers: 12
TMAS: Scaling Test-Time Compute via Multi-Agent Synergy
arXiv 2026
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
arXiv 2025
Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
arXiv 2025
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer
arXiv 2025
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance
arXiv 2025
Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis
arXiv 2024
POA: Pre-training Once for Models of All Sizes
arXiv 2024
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
arXiv 2024
HOPE: A Reinforcement Learning-based Hybrid Policy Path Planner for Diverse Parking Scenarios
arXiv 2024
M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining
arXiv 2024
StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models
arXiv 2024
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs
CVPR 2024 1
Affiliations
Frequent co-authors
10 from 12 papers