Junbo Zhang
Authored papers (11)
MiDashengLM: Efficient Audio Understanding with General Audio Captions
arXiv 2025
GLAP: General contrastive audio-text pretraining across domains and languages
arXiv 2025
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
arXiv 2025
JoyAgent-JDGenie: Technical Report on the GAIA
arXiv 2025
Unified Vision-Language-Action Model
arXiv 2025
Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering
arXiv 2025
CED: Consistent ensemble distillation for audio tagging
arXiv 2023
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
arXiv 2022
Contrastive Deep Supervision
arXiv 2022
speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment
arXiv 2021
GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio
arXiv 2021
Frequent co-authors
10 from 11 papers