Christopher Clark

Papers: 12

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

arXiv 2026

2026

Unified Spatio-Temporal Token Scoring for Efficient Video VLMs

arXiv 2026

2026

Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation

arXiv 2025

2025

SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning

arXiv 2025

2025

2 OLMo 2 Furious

arXiv 2024

2024

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models

CVPR 2025

2024

One Diffusion to Generate Them All

CVPR 2025

2024

Holodeck: Language Guided Generation of 3D Embodied AI Environments

CVPR 2024

2023

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

arXiv 2023

2023

Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models

arXiv 2023

2023

A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge

arXiv 2022

2022

I Can't Believe There's No Images! Learning Visual Tasks Using only Language Supervision

ICCV 2023

2022

Affiliations

No known affiliations.

Frequent co-authors

from 12 papers

Aniruddha Kembhavi

Ranjay Krishna

Sangho Lee

Yue Yang

Rohun Tripathi

Ali Farhadi (CEO)

Chris Callison-Burch

Jiasen Lu

Luca Weihs

Mark Yatskar