Publication: Nearest neighbor approach in speaker adaptation for HMM-based speech synthesis
Type
conferenceObject
Access
restrictedAccess
Publication Status
published
Abstract
The statistical speech synthesis (SSS) approach has become one of the most popular and successful methods in the speech synthesis field. It can generate smooth speech transitions without the spurious errors observed in unit selection systems. Another advantage is its ability to adapt to a target speaker with a couple of minutes of adaptation data. However, many applications, especially in consumer electronics, require adaptation with only a few adaptation utterances. Here, we propose a rapid adaptation technique that first attempts to select a reference model that is close to the target speaker under a given distance measure. Then, instead of adapting to the target speaker from an average model, as is typically done in most systems, adaptation is performed from the new reference model. The proposed system significantly outperformed a state-of-the-art baseline system in both objective and subjective tests, especially when only one utterance is available for adaptation.
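The two-step idea in the abstract (pick the nearest reference model, then adapt from it rather than from the average model) can be illustrated with a toy sketch. Everything here is an assumption for illustration: the abstract does not specify the distance measure or the adaptation algorithm, so this sketch uses Euclidean distance between mean feature vectors and a simple MAP-style interpolation of means as a stand-in. The names `select_reference`, `adapt_mean`, `tau`, and the speaker labels are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(frames):
    """Per-dimension mean of a list of feature frames."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def select_reference(target_frames, reference_means, distance=euclidean):
    """Return the name of the reference model closest to the target data."""
    target_mean = mean_vector(target_frames)
    return min(
        reference_means,
        key=lambda name: distance(target_mean, reference_means[name]),
    )


def adapt_mean(reference_mean, target_frames, tau=2.0):
    """Stand-in adaptation: interpolate the reference mean toward the target mean.

    The weight on the target data grows with the number of frames; tau is a
    hypothetical prior weight, not a value taken from the paper.
    """
    n = len(target_frames)
    target_mean = mean_vector(target_frames)
    w = n / (n + tau)
    return [(1 - w) * r + w * t for r, t in zip(reference_mean, target_mean)]


# Toy reference models, each reduced to a single 2-dimensional mean vector.
reference_means = {
    "average": [0.0, 0.0],
    "speaker_a": [1.0, 1.0],
    "speaker_b": [-2.0, 0.5],
}

# A tiny amount of adaptation data from the target speaker.
target_frames = [[0.8, 1.2], [1.2, 1.0]]

best = select_reference(target_frames, reference_means)
adapted = adapt_mean(reference_means[best], target_frames)
print(best, adapted)
```

In this toy setting the starting point for adaptation is the closest reference model instead of the `average` entry, so less movement is needed when only a few frames are available. A real HMM-based system would operate on full model parameters rather than a single mean vector.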
Date
2013
Publisher
IEEE
Description
Due to copyright restrictions, the full text of this article is available only via subscription.