Jiarui Zhang
- Papers: 10
Authored papers
MDPBench: A Benchmark for Multilingual Document Parsing in Real-World Scenarios
arXiv 2026
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data
arXiv 2026
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm
arXiv 2025
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs
arXiv 2025
LLM-based Automated Theorem Proving Hinges on Scalable Synthetic Data Generation
arXiv 2025
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions
arXiv 2024
Exploring Perceptual Limitation of Multimodal Large Language Models
arXiv 2024
MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning
arXiv 2024
Deformable Model-Driven Neural Rendering for High-Fidelity 3D Reconstruction of Human Heads Under Low-View Settings
ICCV 2023
Towards Perceiving Small Visual Details in Zero-shot Visual Question Answering with Multimodal LLMs
arXiv 2023