MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
Active
A comprehensive dataset designed to evaluate Large Vision-Language Models (LVLMs) across a wide range of multi-image tasks. The dataset encompasses 7 types of multi-image relationships, 52 tasks, 77K images, and 11K meticulously curated multiple-choice questions.
- Publisher: Shanghai AI Laboratory
- Domain: Multimodal
- License: MIT
- Published: Mar 2025
- Notable for: Benchmark for evaluating multimodal models on multi-image understanding.
FAQ
- What is MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models?
- A comprehensive dataset designed to evaluate Large Vision-Language Models (LVLMs) across a wide range of multi-image tasks. The dataset encompasses 7 types of multi-image relationships, 52 tasks, 77K images, and 11K meticulously curated multiple-choice questions.
- What license is MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models under?
- MMIU is available under the MIT License.