Cross-lingual speaker adaptation for statistical speech synthesis using limited data

Sarfjoo, Seyyed Saeed; Demiroğlu, Cenk

dc.contributor.author	Sarfjoo, Seyyed Saeed
dc.contributor.author	Demiroğlu, Cenk
dc.date.accessioned	2017-01-31T11:25:15Z
dc.date.available	2017-01-31T11:25:15Z
dc.date.issued	2016
dc.identifier.issn	2308-457X	en_US
dc.identifier.uri	http://hdl.handle.net/10679/4758
dc.description.abstract	Cross-lingual speaker adaptation with limited adaptation data has many applications, such as speech-to-speech translation systems. Here, we focus on cross-lingual adaptation for statistical speech synthesis (SSS) systems using limited adaptation data. To that end, we propose two techniques that exploit a bilingual Turkish-English speech database that we collected. In the first approach, we propose speaker-specific state-mapping for cross-lingual adaptation, which performed significantly better than the baseline state-mapping algorithm in adapting the excitation parameter in both objective and subjective tests. In the second approach, eigenvoice adaptation is done in the input language, and the result is then used to estimate the eigenvoice weights in the output language using weighted linear regression. The second approach performed significantly better than the baseline system in adapting the spectral envelope parameters in both objective and subjective tests.	en_US
dc.language.iso	eng	en_US
dc.publisher	Interspeech	en_US
dc.relation.ispartof	Proceedings of the Annual Conference of the International Speech Communication Association	en_US
dc.rights	restrictedAccess
dc.title	Cross-lingual speaker adaptation for statistical speech synthesis using limited data	en_US
dc.type	Conference paper	en_US
dc.publicationstatus	published	en_US
dc.contributor.department	Özyeğin University
dc.contributor.authorID	(ORCID 0000-0002-6160-3169 & YÖK ID 144947) Demiroğlu, Cenk
dc.contributor.ozuauthor	Demiroğlu, Cenk
dc.identifier.startpage	317	en_US
dc.identifier.endpage	321	en_US
dc.identifier.doi	10.21437/Interspeech.2016-345	en_US
dc.subject.keywords	Cross-lingual speaker adaptation	en_US
dc.subject.keywords	Eigenvoice adaptation	en_US
dc.subject.keywords	Nearest-neighbor	en_US
dc.subject.keywords	Speaker adaptation	en_US
dc.subject.keywords	Statistical speech synthesis	en_US
dc.identifier.scopus	SCOPUS:2-s2.0-84994385942
dc.contributor.ozugradstudent	Sarfjoo, Seyyed Saeed
dc.contributor.authorMale	2
dc.relation.publicationcategory	Conference Paper - International - Institutional Academic Staff and PhD Student