Mohamed Elhoseiny
- Papers
- 25
Authored papers
25
From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors
arXiv 2026
Small Vision-Language Models are Smart Compressors for Long Video Understanding
arXiv 2026
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
ICCV 2025
Time Blindness: Why Video-Language Models Can't See What Humans Can?
arXiv 2025
WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation
ICCV 2025
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs
arXiv 2025
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding
arXiv 2025
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens
arXiv 2024
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding
arXiv 2024
MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis
arXiv 2024
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding
arXiv 2024
Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling
arXiv 2024
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
arXiv 2024
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?
arXiv 2024
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
arXiv 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
arXiv 2023
Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions
arXiv 2023
StoryGPT-V: Large Language Models as Consistent Story Visualizers
CVPR 2025
CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding
arXiv 2023
Overcoming Generic Knowledge Loss with Selective Parameter Update
CVPR 2024
Continual Zero-Shot Learning through Semantically Guided Generative Random Walks
ICCV 2023
Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation
arXiv 2022
A Simple Baseline that Questions the Use of Pretrained-Models in Continual Learning
arXiv 2022
Creativity Inspired Zero-Shot Learning
Affiliations
Frequent co-authors
10 from 25 papers