Attribution
LaViT: Aligning Latent Visual Thoughts for Multi-modal Reasoning
arXiv 2026
VKnowU: Evaluating Visual Knowledge Understanding in Multimodal LLMs
arXiv 2025
HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks
arXiv 2024
from 3 papers
Fengji Zhang
Jacky Keung
Tianxiang Jiang
Ai Xuan
Bei Chen
Guancheng Lin
Haoyu Yang
Huiyu Bai
LiMin Wang
Linqi Song