NatiQ: An end-to-end text-to-speech system for Arabic

Abdelali, A.; Durrani, N.; Demiroğlu, Cenk; Dalvi, F.; Mubarak, H.; Darwish, K.

dc.contributor.author	Abdelali, A.
dc.contributor.author	Durrani, N.
dc.contributor.author	Demiroğlu, Cenk
dc.contributor.author	Dalvi, F.
dc.contributor.author	Mubarak, H.
dc.contributor.author	Darwish, K.
dc.contributor.department	Electrical & Electronics Engineering
dc.contributor.ozuauthor	DEMİROĞLU, Cenk
dc.date.accessioned	2023-08-03T12:13:30Z
dc.date.available	2023-08-03T12:13:30Z
dc.date.issued	2022
dc.description.abstract	NatiQ is an end-to-end text-to-speech system for Arabic. Our speech synthesizer uses an encoder-decoder architecture with attention. We used both Tacotron-based models (Tacotron 1 and Tacotron 2) and the faster Transformer model to generate mel-spectrograms from characters. We paired Tacotron 1 with the WaveRNN vocoder, Tacotron 2 with the WaveGlow vocoder, and the ESPnet Transformer with the Parallel WaveGAN vocoder to synthesize waveforms from the spectrograms. To train our models, we used in-house speech data for two voices: 1) a neutral male voice, “Hamza”, narrating general content and news, and 2) an expressive female voice, “Amina”, narrating children's story books. Our best systems achieve an average Mean Opinion Score (MOS) of 4.21 and 4.40 for Amina and Hamza, respectively. The objective evaluation of the systems using word and character error rate (WER and CER), as well as the response time measured by real-time factor, favored the end-to-end architecture ESPnet. The NatiQ demo is available online at https://tts.qcri.org.
dc.identifier.endpage	398
dc.identifier.isbn	978-195942927-2
dc.identifier.scopus	2-s2.0-85152915467
dc.identifier.startpage	394
dc.identifier.uri	http://hdl.handle.net/10679/8557
dc.language.iso	eng
dc.publicationstatus	Published
dc.publisher	Association for Computational Linguistics (ACL)
dc.relation.ispartof	WANLP 2022 - 7th Arabic Natural Language Processing - Proceedings of the Workshop
dc.relation.publicationcategory	International
dc.rights	restrictedAccess
dc.title	NatiQ: An end-to-end text-to-speech system for Arabic
dc.type	conferenceObject
dc.type.subtype	Conference paper
dspace.entity.type	Publication
relation.isOrgUnitOfPublication	7b58c5c4-dccc-40a3-aaf2-9b209113b763
relation.isOrgUnitOfPublication.latestForDiscovery	7b58c5c4-dccc-40a3-aaf2-9b209113b763

Collections

Computer Science
