Publication:
DNN-based speaker-adaptive postfiltering with limited adaptation data for statistical speech synthesis systems

Type

conferenceObject

Sub Type

Conference paper

Access

restrictedAccess

Publication Status

Published

Abstract

Deep neural networks (DNNs) have been successfully deployed for acoustic modelling in statistical parametric speech synthesis (SPSS) systems. Moreover, DNN-based postfilters (PF) have also been shown to outperform conventional postfilters that are widely used in SPSS systems for increasing the quality of synthesized speech. However, existing DNN-based postfilters are trained with speaker-dependent databases. Given that SPSS systems can rapidly adapt to new speakers from generic models, there is a need for DNN-based postfilters that can adapt to new speakers with minimal adaptation data. Here, we compare DNN-, RNN-, and CNN-based postfilters together with adversarial (GAN) training and cluster-based initialization (CI) for rapid adaptation. Results indicate that the feedforward (FF) DNN, together with GAN and CI, significantly outperforms the other recently proposed postfilters.

Date

2019

Publisher

IEEE

