Qintong Zhang
7 papers
Authored papers
PEARL: Personalized Streaming Video Understanding Model
arXiv 2026
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
arXiv 2025
TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition
arXiv 2025
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
arXiv 2025
Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More
arXiv 2025
DOCR-Inspector: Fine-Grained and Automated Evaluation of Document Parsing with VLM
arXiv 2025
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
ICCV 2025
Affiliations
No known affiliations.
Frequent co-authors
10 from 7 papers