Hilde Kuehne
- Papers: 19
Authored papers
AMoE: Agglomerative Mixture-of-Experts Vision Foundation Model
arXiv 2025
TTRV: Test-Time Reinforcement Learning for Vision Language Models
arXiv 2025
mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition
arXiv 2025
VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes
arXiv 2025
ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs
arXiv 2024
Teaching VLMs to Localize Specific Objects from In-context Examples
ICCV 2025
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
arXiv 2024
DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners
arXiv 2024
HowToCaption: Prompting LLMs to Transform Video Annotations at Scale
arXiv 2023
Preserving Modality Structure Improves Multi-Modal Learning
ICCV 2023 1
Grounding Everything: Emerging Localization Properties in Vision-Language Transformers
CVPR 2024 1
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
ICCV 2023 1
Learning by Sorting: Self-supervised Learning with Group Ordering Constraints
ICCV 2023 1
In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval
ICCV 2023 1
What, when, and where? -- Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
arXiv 2023
Contrastive Audio-Visual Masked Autoencoder
arXiv 2022
Monotonic Differentiable Sorting Networks
Video Test-Time Adaptation for Action Recognition
CVPR 2023 1
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos
ICCV 2021 10
Affiliations
Frequent co-authors
10 from 19 papers