Bowen Shi
- Papers: 8
Authored papers
SAM Audio: Segment Anything in Audio
arXiv 2025
Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound
arXiv 2025
MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
arXiv 2024
UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding
arXiv 2024
Movie Gen: A Cast of Media Foundation Models
arXiv 2024
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
arXiv 2023
SEGA: Structural Entropy Guided Anchor View for Graph Contrastive Learning
arXiv 2023
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Frequent co-authors
10 from 8 papers