Hao Tian

Papers
25

Authored papers

25

MiMo-V2-Flash Technical Report

arXiv 2026

2026

Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining

arXiv 2026

2026

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

arXiv 2025

2025

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

arXiv 2025

2025

MiMo-VL Technical Report

arXiv 2025

2025

MiMo-Embodied: X-Embodied Foundation Model Technical Report

arXiv 2025

2025

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

arXiv 2025

2025

GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation

arXiv 2025

2025

PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models

CVPR 2025 1

2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

arXiv 2024

2024

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

ICCV 2025

2024

MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity

arXiv 2024

2024

Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory

arXiv 2023

2023

Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation

arXiv 2023

2023

Tool-Augmented Reward Modeling

arXiv 2023

2023

DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving

arXiv 2023

2023

ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages

arXiv 2022

2022

ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding

arXiv 2022

2022

ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

arXiv 2021

2021

ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding

NAACL 2021 4

2020

ERNIE-Doc: A Retrospective Long-Document Modeling Transformer

ACL 2021 5

2020

SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis

ACL 2020

2020

Proactive Interaction Framework for Intelligent Social Receptionist Robots

arXiv 2020

2020

ERNIE: Enhanced Representation through Knowledge Integration

arXiv 2019

2019

ERNIE 2.0: A Continual Pre-training Framework for Language Understanding

arXiv 2019

2019

Affiliations

No known affiliations.

Frequent co-authors

10

from 25 papers