In the domain of music and sound processing, pitch extraction plays a pivotal role. Our research presents a specialized convolutional neural network designed for pitch extraction, particularly from the human singing voice in acapella performances. Notably, our approach combines synthetic data with auto-labeled acapella sung audio, creating a robust training environment. Evaluation across datasets comprising synthetic sounds, opera recordings, and time-stretched vowels demonstrates its efficacy. This work paves the way for enhanced pitch extraction in both music and voice settings.
Human Voice Pitch Estimation: A Convolutional Network with Auto-Labeled and Synthetic Data
A specialized convolutional neural network combines synthetic and auto-labeled acapella data for effective pitch extraction in various musical and vocal contexts.
- Year: 2023
- Venue: arXiv 2023
- Authors: 1
- Hosting: Abstract only
Attribution
- Abstract & full text: arxiv.org/abs/2308.07170v2
- TL;DR: Semantic Scholar