MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding

Multimodal pre-training with text, layout, and image has made significant progress for Visually Rich Document Understanding (VRDU), especially for fixed-layout documents such as scanned document images.

Year: 2021
Venue: arXiv 2021
ArXiv: arxiv.org/abs/2110.08518
Hosting: External source, license unknown

Attribution

Abstract & full text: arxiv.org/abs/2110.08518v2
TL;DR: Semantic Scholar
