dc.contributor.author: Bruneau, P.
dc.contributor.author: Parisot, O.
dc.contributor.author: Mohammadi, Amir
dc.contributor.author: Demiroğlu, Cenk
dc.contributor.author: Ghoniem, M.
dc.contributor.author: Tamisier, T.
dc.date.accessioned: 2016-02-15T13:38:33Z
dc.date.available: 2016-02-15T13:38:33Z
dc.date.issued: 2014-05
dc.identifier.isbn: 978-2-9517408-8-4
dc.identifier.uri: http://hdl.handle.net/10679/2372
dc.description.abstract: Statistical speech synthesis (SSS) models typically lie in a very high-dimensional space. They can be used to enable speech synthesis on digital devices using only a few sentences of input from the user. However, the adaptation algorithms of such weakly trained models suffer from the high dimensionality of the feature space. Because creating new voices is easy with the SSS approach, thousands of voices can be trained, and a nearest-neighbor algorithm can be used to obtain better speaker similarity in those limited-data cases. Nearest-neighbor methods require good distance measures that correlate well with human perception. This paper investigates the problem of finding good low-cost metrics, i.e. simple functions of feature values that map to objective signal quality metrics. To this end, we use high-dimensional data visualization and dimensionality reduction techniques. Data mining principles are also applied to formulate a tractable view of the problem and to propose tentative solutions. With a performance index improved by 36% with respect to a naive solution, while using only 0.77% of the respective number of features, our results are promising. Finally, perspectives on new adaptation algorithms and on tighter integration of data mining and visualization principles are given.
dc.language.iso: eng [en_US]
dc.publisher: European Language Resources Association
dc.relation.ispartof: 9th International Conference on Language Resources and Evaluation (LREC)
dc.rights: restrictedAccess
dc.title: Finding relevant features for statistical speech synthesis adaptation [en_US]
dc.type: Conference paper [en_US]
dc.publicationstatus: published [en_US]
dc.contributor.department: Özyeğin University
dc.contributor.authorID: (ORCID 0000-0002-6160-3169 & YÖK ID 144947) Demiroğlu, Cenk
dc.contributor.ozuauthor: Demiroğlu, Cenk
dc.identifier.wos: WOS:000355611001083
dc.subject.keywords: Speech synthesis
dc.subject.keywords: Speaker adaptation
dc.subject.keywords: Feature selection
dc.subject.keywords: Visual analytics
dc.contributor.ozugradstudent: Mohammadi, Amir
dc.contributor.authorMale: 2
dc.relation.publicationcategory: Conference Paper - International - Institutional Academic Staff and Graduate Student

