Hanoona Rasheed

Papers: 10

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

VideoMathQA: Benchmarking Mathematical Reasoning via Multimodal Understanding in Videos

arXiv 2025

2025

Video-CoM: Interactive Video Reasoning via Chain of Manipulations

arXiv 2025

2025

Perception Encoder: The best visual embeddings are not at the output of the network

arXiv 2025

2025

VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding

arXiv 2024

2024

PALO: A Polyglot Large Multimodal Model for 5B People

arXiv 2024

2024

SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications

ICCV 2023

2023

GLaMM: Pixel Grounding Large Multimodal Model

CVPR 2024

2023

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

arXiv 2023

2023

Fine-tuned CLIP Models are Efficient Video Learners

CVPR 2023

2022

MaPLe: Multi-modal Prompt Learning

2022

Affiliations

No known affiliations.

Frequent co-authors

from 10 papers

Muhammad Maaz

Salman Khan

Fahad Shahbaz Khan

Abdelrahman Shaker

Ming-Hsuan Yang

Fahad Khan

Fahad S. Khan

Muhammad Uzair Khattak

2 shared papers

Rao M. Anwer

2 shared papers

Andrea Madotto

1 shared paper