Yinghao Ma
- Papers: 18
Authored papers (18)
Audio-Visual Intelligence in Large Foundation Models
arXiv 2026
CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction
arXiv 2026
YuE: Scaling Open Foundation Models for Long-Form Music Generation
arXiv 2025
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
arXiv 2025
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following
arXiv 2025
AutoMV: An Automatic Multi-Agent System for Music Video Generation
arXiv 2025
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs
arXiv 2025
Audio-FLAN: A Preliminary Release
arXiv 2025
Seeing the Forest and the Trees: Query-Aware Tokenizer for Long-Video Multimodal Language Models
arXiv 2025
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
arXiv 2024
OmniBench: Towards The Future of Universal Omni-Language Models
arXiv 2024
ChatMusician: Understanding and Generating Music Intrinsically with LLM
arXiv 2024
ComposerX: Multi-Agent Symbolic Music Composition with LLMs
arXiv 2024
Foundation Models for Music: A Survey
arXiv 2024
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training
arXiv 2023
LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT
arXiv 2023
MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response
arXiv 2023
MERTech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model With Multi-Task Finetuning
arXiv 2023
Affiliations
Frequent co-authors
10 from 18 papers