Abdelrahman Shaker

Papers: 12

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

WorldCache: Content-Aware Caching for Accelerated Video World Models

arXiv 2026

2026

Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device

arXiv 2026

2026

VideoMathQA: Benchmarking Mathematical Reasoning via Multimodal Understanding in Videos

arXiv 2025

2025

Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model

arXiv 2025

2025

VideoMolmo: Spatio-Temporal Grounding Meets Pointing

arXiv 2025

2025

EvoLMM: Self-Evolving Large Multimodal Models with Continuous Rewards

arXiv 2025

2025

GroupMamba: Efficient Group-Based Visual State Space Model

CVPR 2025

2024

PALO: A Polyglot Large Multimodal Model for 5B People

arXiv 2024

2024

SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications

ICCV 2023

2023

GLaMM: Pixel Grounding Large Multimodal Model

CVPR 2024

2023

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models

arXiv 2023

2023

EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

from 12 papers

Salman Khan

Fahad Shahbaz Khan

Muhammad Maaz

Hisham Cholakkal

Hanoona Rasheed

Ahmed Heakl

Ming-Hsuan Yang

Rao Muhammad Anwer

Fahad Khan

Fahad S. Khan