Bernard Ghanem
- Papers: 55
Authored papers
VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
arXiv 2026
TAPS: Task Aware Proposal Distributions for Speculative Sampling
arXiv 2026
QuanBench+: A Unified Multi-Framework Benchmark for LLM-Based Quantum Code Generation
arXiv 2026
ReactMotion: Generating Reactive Listener Motions from Speaker Utterance
arXiv 2026
NearID: Identity Representation Learning via Near-identity Distractors
arXiv 2026
OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
arXiv 2025
OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection
arXiv 2025
SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning
CVPR 2025 1
Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think
arXiv 2025
AraLingBench: A Human-Annotated Benchmark for Evaluating Arabic Linguistic Capabilities of Large Language Models
arXiv 2025
Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers
arXiv 2025
DiffCLIP: Differential Attention Meets CLIP
arXiv 2025
Towards Data-Efficient Pretraining for Atomic Property Prediction
arXiv 2025
Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation
arXiv 2025
MOLE: Metadata Extraction and Validation in Scientific Papers Using LLMs
arXiv 2025
UnMix-NeRF: Spectral Unmixing Meets Neural Radiance Fields
arXiv 2025
Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale
arXiv 2025
BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding
CVPR 2025 1
β-CLIP: Text-Conditioned Contrastive Learning for Multi-Granular Vision-Language Alignment
arXiv 2025
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding
arXiv 2025
Train Long, Think Short: Curriculum Learning for Efficient Reasoning
arXiv 2025
OASIS: Open Agent Social Interaction Simulations with One Million Agents
arXiv 2024
3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes
CVPR 2025 1
CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents
arXiv 2024
SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap
arXiv 2024
RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything
arXiv 2024
GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering
arXiv 2024
SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training?
arXiv 2024
Can Large Language Model Agents Simulate Human Trust Behavior?
arXiv 2024
Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders
arXiv 2024
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
arXiv 2024
Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning
arXiv 2024
GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning
arXiv 2024
MatchDiffusion: Training-free Generation of Match-cuts
ICCV 2025
ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders
arXiv 2024
CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
NeurIPS 2023 11
Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors
arXiv 2023
A Unified Continual Learning Framework with General Parameter-Efficient Tuning
ICCV 2023 1
FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model
ICCV 2023 1
SoccerNet 2023 Challenges Results
arXiv 2023
Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right?
ICCV 2023 1
Re-ReND: Real-time Rendering of NeRFs across Devices
ICCV 2023 1
Improving GAN Training via Feature Space Shrinkage
arXiv 2023
Learning to Identify Critical States for Reinforcement Learning from Videos
ICCV 2023 1
Localizing Moments in Long Video Via Multimodal Guidance
ICCV 2023 1
Boundary-Denoising for Video Activity Localization
arXiv 2023
SoccerNet 2022 Challenges Results
arXiv 2022
Egocentric Video-Language Pretraining
arXiv 2022
EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries
ICCV 2023 1
Combating Mode Collapse in GANs via Manifold Entropy Estimation
arXiv 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video
CVPR 2022 1
AniGAN: Style-Guided Generative Adversarial Networks for Unsupervised Anime Face Generation
arXiv 2021
SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation
arXiv 2021
VLG-Net: Video-Language Graph Matching Network for Video Grounding
arXiv 2020
Finding Moments in Video Collections Using Natural Language
arXiv 2019
Frequent co-authors
10 from 55 papers