Attribution
MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding
arXiv 2026
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
arXiv 2025
LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts
from 3 papers
Wentao Zhang
Bin Wang
Conghui He
Junbo Niu
Bin Cui
Bo Zhang
Bowen Zhou
professor
Boyu Niu
Chao Xu
Dahua Lin