Peter Grasch

Attribution

3 papers

Authored papers

MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs

arXiv 2025

FastVLM: Efficient Vision Encoding for Vision Language Models

CVPR 2025

MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs

arXiv 2024

No known affiliations.

Co-authors (from 3 papers)

Yinfei Yang

Afshin Dehghan

Albert Antony

Cem Koc

Chun-Liang Li

David Griffiths

Erik Daxberger

Fartash Faghri

Gefen Kohavi

Gokul Santhanam