Zichen Wen
- Papers: 15
Authored papers (15)
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery
arXiv 2026
Panoramic Affordance Prediction
arXiv 2026
Kimi K2.5: Visual Agentic Intelligence
arXiv 2026
Shifting AI Efficiency From Model-Centric to Data-Centric Compression
arXiv 2025
TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition
arXiv 2025
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs
arXiv 2025
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
arXiv 2025
LEGION: Learning to Ground and Explain for Synthetic Image Detection
ICCV 2025
DOCR-Inspector: Fine-Grained and Automated Evaluation of Document Parsing with VLM
arXiv 2025
TACTIC: Translation Agents with Cognitive-Theoretic Interactive Collaboration
arXiv 2025
Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning
arXiv 2025
Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More
arXiv 2025
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
arXiv 2025
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
arXiv 2024
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
ICCV 2025
Affiliations
Frequent co-authors
10 from 15 papers