0

Emotion Recognition From Speech With Recurrent Neural Networks

A deep recurrent neural network with a probabilistic CTC loss function achieves high accuracy in emotion recognition from speech, surpassing recent methods and human performance.

Year
2017
Venue
arXiv 2017
Authors
2
Hosting
Abstract onlyARXIV-DEFAULT

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/1701.08071v2ARXIV-DEFAULT
TL;DR
Semantic Scholar
Attribution policy →

Abstract

In this paper the task of emotion recognition from speech is considered. Proposed approach uses deep recurrent neural network trained on a sequence of acoustic features calculated over small speech intervals. At the same time special probabilistic-nature CTC loss function allows to consider long utterances containing both emotional and neutral parts. The effectiveness of such an approach is shown in two ways. Firstly, the comparison with recent advances in this field is carried out. Secondly, human performance on the same task is measured. Both criteria show the high quality of the proposed method.

Authors

2