David M. Chan
Authored papers (13)
Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing
arXiv 2026
VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents
arXiv 2026
REOrdering Patches Improves Vision Models
arXiv 2025
Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling
arXiv 2025
TULIP: Towards Unified Language-Image Pretraining
arXiv 2025
Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint
arXiv 2025
Constantly Improving Image Models Need Constantly Improving Benchmarks
arXiv 2025
Higher-Order Binding of Language Model Virtual Personas: a Study on Approximating Political Partisan Misperceptions
arXiv 2025
Virtual Personas for Language Models via an Anthology of Backstories
arXiv 2024
ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video
arXiv 2024
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark
arXiv 2024
CLAIR-A: Leveraging Large Language Models to Judge Audio Captions
arXiv 2024
Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech Recognition
arXiv 2024
Frequent co-authors
10 from 13 papers