MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

Active

A comprehensive dataset designed to evaluate Large Vision-Language Models (LVLMs) across a wide range of multi-image tasks. The dataset encompasses 7 types of multi-image relationships, 52 tasks, 77K images, and 11K meticulously curated multiple-choice questions.

Domain
Multimodal
License
MIT
Published
Mar 2025
Notable for
Benchmark for evaluating multi-image understanding in Large Vision-Language Models.

FAQ

What is MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models?
A comprehensive dataset designed to evaluate Large Vision-Language Models (LVLMs) across a wide range of multi-image tasks. The dataset encompasses 7 types of multi-image relationships, 52 tasks, 77K images, and 11K meticulously curated multiple-choice questions.
What license is MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models under?
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models is available under the MIT license.