Yinghao Ma

Authored papers (18)

Audio-Visual Intelligence in Large Foundation Models

arXiv 2026

CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction

arXiv 2026

YuE: Scaling Open Foundation Models for Long-Form Music Generation

arXiv 2025

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

arXiv 2025

CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following

arXiv 2025

AutoMV: An Automatic Multi-Agent System for Music Video Generation

arXiv 2025

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs

arXiv 2025

Audio-FLAN: A Preliminary Release

arXiv 2025

Seeing the Forest and the Trees: Query-Aware Tokenizer for Long-Video Multimodal Language Models

arXiv 2025

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

arXiv 2024

OmniBench: Towards The Future of Universal Omni-Language Models

arXiv 2024

ChatMusician: Understanding and Generating Music Intrinsically with LLM

arXiv 2024

ComposerX: Multi-Agent Symbolic Music Composition with LLMs

arXiv 2024

Foundation Models for Music: A Survey

arXiv 2024

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training

arXiv 2023

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

arXiv 2023

MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response

arXiv 2023

MERTech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model With Multi-Task Finetuning

arXiv 2023

Affiliations

No known affiliations.

Frequent co-authors: 10 (from 18 papers)