Min Dou

Papers: 13

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

arXiv 2025

InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling

arXiv 2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

arXiv 2025

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond

arXiv 2024

DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes

arXiv 2024

OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving

arXiv 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

arXiv 2024

ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

arXiv 2024

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

arXiv 2024

ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation

arXiv 2023

Drive Like a Human: Rethinking Autonomous Driving with Large Language Models

arXiv 2023

DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models

arXiv 2023

On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving

arXiv 2023

Affiliations

No known affiliations.

Frequent co-authors

from 13 papers

Botian Shi

Yu Qiao

Licheng Wen

Pinlong Cai

Bo Zhang

Daocheng Fu

Conghui He

LiMin Wang

Nianchen Deng

Wenhai Wang