Mubarak Shah

Papers
24

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

24

LLM Post-Training: A Deep Dive into Reasoning Large Language Models

arXiv 2025

2025

HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation

arXiv 2025

2025

SoccerChat: Integrating Multimodal Data for Enhanced Soccer Game Understanding

arXiv 2025

2025

VLDBench: Vision Language Models Disinformation Detection Benchmark

arXiv 2025

2025

CoLLM: A Large Language Model for Composed Image Retrieval

CVPR 2025 1

2025

Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks

arXiv 2025

2025

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

arXiv 2025

2025

ImplicitQA: Going beyond frames towards Implicit Video Reasoning

arXiv 2025

2025

AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation

arXiv 2024

2024

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

CVPR 2025 1

2024

SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding

arXiv 2024

2024

Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning

ICCV 2025

2024

Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention

arXiv 2024

2024

Curriculum Direct Preference Optimization for Diffusion and Consistency Models

CVPR 2025 1

2024

SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset

arXiv 2024

2024

Foundational Models Defining a New Era in Vision: A Survey and Outlook

arXiv 2023

2023

GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization

2023

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

arXiv 2023

2023

Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?

CVPR 2024 1

2023

Preserving Modality Structure Improves Multi-Modal Learning

ICCV 2023 1

2023

TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection

ICCV 2023 1

2023

CDFSL-V: Cross-Domain Few-Shot Learning for Videos

ICCV 2023 1

2023

When Do Curricula Work in Federated Learning?

ICCV 2023 1

2022

Handwriting Transformers

ICCV 2021 10

2021

Affiliations

No known affiliations.

Frequent co-authors

10 (from 24 papers)