Finding relevant features for statistical speech synthesis adaptation

Bruneau, P.; Parisot, O.; Mohammadi, Amir; Demiroğlu, Cenk; Ghoniem, M.; Tamisier, T.

dc.contributor.author	Bruneau, P.
dc.contributor.author	Parisot, O.
dc.contributor.author	Mohammadi, Amir
dc.contributor.author	Demiroğlu, Cenk
dc.contributor.author	Ghoniem, M.
dc.contributor.author	Tamisier, T.
dc.contributor.department	Electrical & Electronics Engineering
dc.contributor.ozuauthor	DEMİROĞLU, Cenk
dc.contributor.ozugradstudent	Mohammadi, Amir
dc.date.accessioned	2016-02-15T13:38:33Z
dc.date.available	2016-02-15T13:38:33Z
dc.date.issued	2014-05
dc.description.abstract	Statistical speech synthesis (SSS) models typically lie in a very high-dimensional space. They can be used to enable speech synthesis on digital devices using only a few sentences of input from the user. However, the adaptation algorithms for such weakly trained models suffer from the high dimensionality of the feature space. Because creating new voices is easy with the SSS approach, thousands of voices can be trained, and a nearest-neighbor algorithm can be used to obtain better speaker similarity in those limited-data cases. Nearest-neighbor methods require good distance measures that correlate well with human perception. This paper investigates the problem of finding good low-cost metrics, i.e., simple functions of feature values that map to objective signal quality metrics. To this end, we use high-dimensional data visualization and dimensionality reduction techniques. Data mining principles are also applied to formulate a tractable view of the problem and to propose tentative solutions. Our results are promising: the performance index improves by 36% relative to a naive solution while using only 0.77% of the corresponding number of features. Finally, perspectives on new adaptation algorithms and on tighter integration of data mining and visualization principles are given.
dc.identifier.isbn	978-2-9517408-8-4
dc.identifier.uri	http://hdl.handle.net/10679/2372
dc.identifier.wos	000355611001083
dc.language.iso	eng	en_US
dc.publicationstatus	published	en_US
dc.publisher	European Language Resources Association
dc.relation.ispartof	9th International Conference on Language Resources and Evaluation (LREC)
dc.relation.publicationcategory	International
dc.rights	restrictedAccess
dc.subject.keywords	Speech synthesis
dc.subject.keywords	Speaker adaptation
dc.subject.keywords	Feature selection
dc.subject.keywords	Visual analytics
dc.title	Finding relevant features for statistical speech synthesis adaptation	en_US
dc.type	conferenceObject	en_US
dspace.entity.type	Publication
relation.isOrgUnitOfPublication	7b58c5c4-dccc-40a3-aaf2-9b209113b763
relation.isOrgUnitOfPublication.latestForDiscovery	7b58c5c4-dccc-40a3-aaf2-9b209113b763

Collections

Computer Science
