JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis

A new Japanese speech corpus, JSUT, is designed for end-to-end speech synthesis and is freely available online.

Year
2017
Venue
arXiv 2017
Authors
3

Attribution

Abstract & full text
arxiv.org/abs/1711.00354
TL;DR
Semantic Scholar

Abstract

With improvements in machine learning techniques, including deep learning, a free large-scale speech corpus that can be shared between academic institutions and commercial companies plays an important role. However, no such corpus exists for Japanese speech synthesis. In this paper, we design a novel Japanese speech corpus, named the "JSUT corpus," aimed at end-to-end speech synthesis. The corpus consists of 10 hours of reading-style speech data with transcriptions and covers all of the main pronunciations of daily-use Japanese characters. We describe how we designed and analyzed the corpus. The corpus is freely available online.
