DocVQA: A Dataset for VQA on Document Images
Active
DocVQA is a Visual Question Answering (VQA) benchmark consisting of 50,000 questions over more than 12,000 document images. This implementation runs and scores the "validation" split.
- Publisher
- Computer Vision Center (CVC) at UAB
- Domain
- Multimodal
- License
- MIT
- Published
- Nov 2024
- Notable for
- Benchmark for evaluating multimodal models.
FAQ
- What license is DocVQA: A Dataset for VQA on Document Images under?
- DocVQA: A Dataset for VQA on Document Images is available under the MIT license.