Shenglong Ye
- Papers
- 8
Cite
Notes
Only stored in your browser.
Authored papers
8Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
arXiv 2025
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
arXiv 2025
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data
arXiv 2025
InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models
arXiv 2025
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding
arXiv 2025
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning
arXiv 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
arXiv 2025
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
arXiv 2024
Affiliations
Frequent co-authors
10from 8 papers