Roman Bachmann
6 papers
Authored papers
VideoFlexTok: Flexible-Length Coarse-to-Fine Video Tokenization
arXiv 2026
(1D) Ordered Tokens Enable Efficient Test-Time Search
arXiv 2026
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks
arXiv 2025
4M: Massively Multimodal Masked Modeling
MultiMAE: Multi-modal Multi-task Masked Autoencoders
arXiv 2022
Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers