Wei-Ning Hsu

Papers: 12

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

SAM Audio: Segment Anything in Audio

arXiv 2025

Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound

arXiv 2025

FlowDec: A flow-based full-band general audio codec with high perceptual quality

arXiv 2025

Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

arXiv 2024

Movie Gen: A Cast of Media Foundation Models

arXiv 2024

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

2023

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

arXiv 2023

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language

arXiv 2022

data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language

Preprint 2022 1

Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction

2022

Generative Spoken Language Modeling from Raw Audio

arXiv 2021

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations

arXiv 2021

Affiliations

No known affiliations.

Frequent co-authors

from 12 papers

Bowen Shi

Yi-Chiao Wu

Adam Polyak

Alexei Baevski

Andros Tjandra

Ann Lee

Apoorv Vyas

John Hoffman

Kushal Lakhotia

Matt Le