Trevor Darrell

UC Berkeley professor; one of the most cited computer-vision researchers, a co-founder of the BAIR Lab, and a co-creator of the BDD100K dataset.

Role: professor
Currently at: University of California, Berkeley
Scholar: scholar.google.com/citations
Papers: 69

Attribution

Affiliations & profile: scholar.google.com/citations

Authored papers

69

Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing

arXiv 2026

Learning a Generative Meta-Model of LLM Activations

arXiv 2026

LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory

arXiv 2026

Describe Anything: Detailed Localized Image and Video Captioning

ICCV 2025

Reconstruction Alignment Improves Unified Multimodal Models

arXiv 2025

Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens

arXiv 2025

FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos

arXiv 2025

UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity

arXiv 2025

Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint

arXiv 2025

Constantly Improving Image Models Need Constantly Improving Benchmarks

arXiv 2025

Scaling Vision Pre-Training to 4K Resolution

CVPR 2025

Visually Prompted Benchmarks Are Surprisingly Fragile

arXiv 2025

Pillar-0: A New Frontier for Radiology Foundation Models

arXiv 2025

REOrdering Patches Improves Vision Models

arXiv 2025

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

arXiv 2025

Learning Adaptive Parallel Reasoning with Language Models

arXiv 2025

Video Action Differencing

arXiv 2025

Search Arena: Analyzing Search-Augmented LLMs

arXiv 2025

TULIP: Towards Unified Language-Image Pretraining

arXiv 2025

AutoPresent: Designing Structured Visuals from Scratch

CVPR 2025

Atlas: Multi-Scale Attention Improves Long Context Image Modeling

arXiv 2025

Navigation World Models

CVPR 2025

MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion

arXiv 2024

VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models

arXiv 2024

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

CVPR 2024

Segment Anything without Supervision

arXiv 2024

When Do We Not Need Larger Vision Models?

arXiv 2024

Neural Network Diffusion

arXiv 2024

InstanceDiffusion: Instance-level Control for Image Generation

CVPR 2024

Wolf: Captioning Everything with a World Summarization Framework

arXiv 2024

ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs

arXiv 2024

Rethinking Patch Dependence for Masked Autoencoders

arXiv 2024

xT: Nested Tokenization for Larger Context in Large Images

arXiv 2024

Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark

arXiv 2024

CLAIR-A: Leveraging Large Language Models to Judge Audio Captions

arXiv 2024

Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning

arXiv 2024

Initializing Models with Larger Ones

arXiv 2023

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation

CVPR 2024

Compositional Chain-of-Thought Prompting for Large Multimodal Models

CVPR 2024

Stochastic positional embeddings improve masked image modeling

arXiv 2023

PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor

arXiv 2023

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

arXiv 2023

Sequential Modeling Enables Scalable Learning for Large Vision Models

CVPR 2024

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

arXiv 2023

Hierarchical Open-vocabulary Universal Image Segmentation

Dropout Reduces Underfitting

arXiv 2023

Unsupervised Universal Image Segmentation

CVPR 2024

Describing Differences in Image Sets with Natural Language

CVPR 2024

Guiding Pretraining in Reinforcement Learning with Large Language Models

arXiv 2023

Modular Visual Question Answering via Code Generation

arXiv 2023

A ConvNet for the 2020s

CVPR 2022

Visual Prompting via Image Inpainting

arXiv 2022

Back to the Source: Diffusion-Driven Test-Time Adaptation

arXiv 2022

Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning

ICCV 2023

Contrastive Test-Time Adaptation

CVPR 2022

Multitask Vision-Language Prompt Tuning

arXiv 2022

Refine and Represent: Region-to-Object Representation Learning

arXiv 2022

Fighting Gradients with Gradients: Dynamic Defenses against Adversarial Attacks

NeurIPS 2021 12

Tent: Fully Test-time Adaptation by Entropy Minimization

BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

Rethinking the Value of Network Pruning

Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders

arXiv 2018

Multimodal Explanations: Justifying Decisions and Pointing to the Evidence

Deep Layer Aggregation

SkipNet: Learning Dynamic Routing in Convolutional Networks

Context Encoders: Feature Learning by Inpainting

End-to-end Learning of Driving Models from Large-scale Video Datasets

Caffe: Convolutional Architecture for Fast Feature Embedding

arXiv 2014

DenseNet: Implementing Efficient ConvNet Descriptor Pyramids

arXiv 2014

Affiliations

Currently at

University of California, Berkeley

professor · university lab

Previously

MIT CSAIL · university lab

Frequent co-authors

10, from 69 papers

Xudong Wang

12 shared papers

Long Lian

11 shared papers

David M. Chan

8 shared papers

Joseph E. Gonzalez

8 shared papers

Adam Yala

7 shared papers

Boyi Li

7 shared papers

Baifeng Shi

5 shared papers

Dequan Wang

5 shared papers

Evan Shelhamer

5 shared papers

Jiaxin Ge

5 shared papers