In this paper, we present a database of emotional speech intended to be open-sourced and used for synthesis and generation purposes. It contains data for male and female actors in English and a male actor in French. The database covers 5 emotion classes, so it could be suitable for building synthesis and voice transformation systems with the potential to control the emotional dimension in a continuous way. We show the data's efficiency by building a simple MLP system that converts neutral to angry speech style and evaluating it via a CMOS perception test. Even though the system is a very simple one, the test shows the efficiency of the data, which is promising for future work.
The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems
A dataset of emotional speech for synthesis and transformation systems demonstrates effectiveness in a perception test, showing promise for future applications.
- Year
- 2018
- Venue
- arXiv 2018
- Authors
- 5
Attribution
- Abstract & full text
- arxiv.org/abs/1806.09514
- TL;DR
- Semantic Scholar