Yuki Mitsufuji

Papers
25

Authored papers

25

DeepResonance: Enhancing Multimodal Music Understanding via Music-centric Multi-way Instruction Tuning

arXiv 2025

2025

Improving Inference-Time Optimisation for Vocal Effects Style Transfer with a Gaussian Prior

arXiv 2025

2025

HumanGif: Single-View Human Diffusion with Generative Prior

arXiv 2025

2025

Training Consistency Models with Variational Noise Coupling

arXiv 2025

2025

A Comprehensive Real-World Assessment of Audio Watermarking Algorithms: Will They Survive Neural Codecs?

arXiv 2025

2025

MeanFlow Transformers with Representation Autoencoders

arXiv 2025

2025

CARE: Aligning Language Models for Regional Cultural Awareness

arXiv 2025

2025

MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

CVPR 2025

2024

GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping

arXiv 2024

2024

Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning

arXiv 2024

2024

SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation

arXiv 2024

2024

A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation

arXiv 2024

2024

GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models

arXiv 2024

2024

MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation

arXiv 2024

2024

MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models

arXiv 2024

2024

Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion

arXiv 2023

2023

BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network

2023

SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer

arXiv 2023

2023

GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration

arXiv 2023

2023

STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events

NeurIPS 2023

2023

PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives

arXiv 2023

2023

Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects

arXiv 2022

2022

ComFact: A Benchmark for Linking Contextual Commonsense Knowledge

arXiv 2022

2022

CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos

arXiv 2022

2022

D3Net: Densely connected multidilated DenseNet for music source separation

arXiv 2020

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 25 papers