James M. Rehg

Papers: 11

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Toward Cognitive Supersensing in Multimodal Large Language Model

arXiv 2026

2026

STRIDE: When to Speak Meets Sequence Denoising for Streaming Video Understanding

arXiv 2026

2026

Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders

CVPR 2025

2024

What is the Visual Cognition Gap between Humans and Multimodal LLMs?

arXiv 2024

2024

REBAR: Retrieval-Based Reconstruction for Time-series Contrastive Learning

arXiv 2023

2023

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

CVPR 2024

2023

RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models

CVPR 2024

2023

ZeroShape: Regression-based Zero-shot Shape Reconstruction

CVPR 2024

2023

Ego4D: Around the World in 3,000 Hours of Egocentric Video

CVPR 2022

2021

Fine-Grained Head Pose Estimation Without Keypoints

arXiv 2017

2017

Dockerface: an Easy to Install and Use Faster R-CNN Face Detector in a Docker Container

arXiv 2017

2017

Affiliations

No known affiliations.

Frequent co-authors

from 11 papers

Fiona Ryan

Jianguo Cao

Nataniel Ruiz

Wenqian Ye

Xu Cao

Yunsheng Ma

Abrham Gebreselasie

Adriano Fragomeni

Ajay Bati

Akshay Erapalli