Jingdong Chen
- Papers: 18
Authored papers (18)
Ming-Omni: A Unified Multimodal Model for Perception and Generation
arXiv 2025
Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
arXiv 2025
HumanSense: From Multimodal Perception to Empathetic Context-Aware Responses through Reasoning MLLMs
arXiv 2025
M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning
arXiv 2025
M2-omni: Advancing Omni-MLLM for Comprehensive Modality Support with Competitive Performance
arXiv 2025
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
ICCV 2025
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer
arXiv 2025
Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis
arXiv 2024
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
arXiv 2024
M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining
arXiv 2024
ViTime: A Visual Intelligence-Based Foundation Model for Time Series Forecasting
arXiv 2024
POA: Pre-training Once for Models of All Sizes
arXiv 2024
SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding
arXiv 2024
StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models
arXiv 2024
LogicMP: A Neuro-symbolic Approach for Encoding First-order Logic Constraints
arXiv 2023
CBNet: A Composite Backbone Network Architecture for Object Detection
arXiv 2021
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario
arXiv 2021
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
arXiv 2015
Frequent co-authors
10 from 18 papers