Attribution
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
arXiv 2025
FUSION: Fully Integration of Vision-Language Representations for Deep Cross-Modal Understanding
from 2 papers
Conghui He
Wentao Zhang
Zheng Liu
Bin Cui
Bin Wang
Bo Zhang
Bowen Zhou
professor
Boyu Niu
Chao Xu
Dahua Lin