Zhizheng Wu
- Papers: 10
Authored papers
Metis: A Foundation Speech Generation Model with Masked Generative Pre-training
arXiv 2025
TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling
arXiv 2025
Solla: Towards a Speech-Oriented LLM That Hears Acoustic Context
arXiv 2025
AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models
arXiv 2025
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation
arXiv 2024
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds
arXiv 2024
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation
arXiv 2024
Foundation Models for Music: A Survey
arXiv 2024
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
arXiv 2024
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
arXiv 2024