Mahdi Rad

Attribution

3 papers

Authored papers

AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding

arXiv 2026

Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models

arXiv 2026

EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models

arXiv 2025

No known affiliations.

from 3 papers

Haozhe Qi

Marc Pollefeys

Alexander Mathis

Kevin Qu

Rui Wang

Andy Bonnetto

Franklin Leong

Friedhelm Hummel

Matea Tashkovska

Mihai Dusmanu