Publication:
Analysis of speaker similarity in the statistical speech synthesis systems using a hybrid approach

dc.contributor.author: Güner, Ekrem
dc.contributor.author: Mohammadi, A.
dc.contributor.author: Demiroğlu, Cenk
dc.contributor.department: Electrical & Electronics Engineering
dc.contributor.ozuauthor: DEMİROĞLU, Cenk
dc.contributor.ozugradstudent: Güner, Ekrem
dc.date.accessioned: 2014-11-25T11:34:41Z
dc.date.available: 2014-11-25T11:34:41Z
dc.date.issued: 2012
dc.description: Due to copyright restrictions, access to the full text of this article is available only via subscription. [en_US]
dc.description.abstract: The statistical speech synthesis (SSS) approach has become one of the most popular and successful methods in the speech synthesis field. Smooth speech transitions, without the spurious errors that are observed in unit selection systems, can be generated with the SSS approach. However, a well-known issue with SSS is the lack of voice similarity to the target speaker. The issue arises both in speaker-dependent models and in models that are adapted from average voices. Moreover, in speaker adaptation, similarity to the target speaker does not increase significantly after around one minute of adaptation data, which potentially indicates inherent bottleneck(s) in the system. Here, we propose using the hybrid speech synthesis approach to understand the key factors behind the speaker similarity problem. To that end, we try to answer the following question: which segments and parameters of speech, if generated/synthesized better, would substantially improve speaker similarity? In this work, our hybrid methods are described and listening test results are presented and discussed. [en_US]
dc.identifier.endpage: 2059
dc.identifier.isbn: 978-1-4673-1068-0
dc.identifier.scopus: 2-s2.0-84869747260
dc.identifier.startpage: 2055
dc.identifier.uri: http://hdl.handle.net/10679/676
dc.identifier.wos: 000310623800413
dc.language.iso: eng [en_US]
dc.peerreviewed: yes [en_US]
dc.publicationstatus: published [en_US]
dc.publisher: IEEE [en_US]
dc.relation.ispartof: Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European
dc.relation.publicationcategory: International
dc.rights: restrictedAccess
dc.subject.keywords: Speaker recognition [en_US]
dc.subject.keywords: Speech synthesis [en_US]
dc.subject.keywords: Statistical analysis [en_US]
dc.title: Analysis of speaker similarity in the statistical speech synthesis systems using a hybrid approach [en_US]
dc.type: conferenceObject [en_US]
dspace.entity.type: Publication
relation.isOrgUnitOfPublication: 7b58c5c4-dccc-40a3-aaf2-9b209113b763
relation.isOrgUnitOfPublication.latestForDiscovery: 7b58c5c4-dccc-40a3-aaf2-9b209113b763
