Papers

Trending research and the full catalog - each paper linked to the benchmarks, methods, and models it introduces.

Filtered by domain: audio-generation

LTX-2: Efficient Joint Audio-Visual Foundation Model

Yoav HaCohen, Benny Brazowski, Nisan Chiprut et al. · 6 Jan 2026

Recent text-to-video diffusion models can generate compelling video sequences, yet they remain silent -- missing the semantic, emotional, and atmospheric cues that audio provides.

Audio Generation · Video generation

7.3k

Stable Audio 3

Zach Evans, Julian D. Parker, Matthew Rice et al. · 18 May 2026

Stable Audio 3 is a family of fast latent diffusion models (small, medium, large) for variable-length audio generation and editing.

Audio Generation · Image Inpainting

5270.0/h