Spoofing voice verification systems with statistical speech synthesis using limited adaptation data

Khodabakhsh, Ali; Mohammadi, Amir; Demiroğlu, Cenk


dc.contributor.author	Khodabakhsh, Ali
dc.contributor.author	Mohammadi, Amir
dc.contributor.author	Demiroğlu, Cenk
dc.contributor.department	Electrical & Electronics Engineering
dc.contributor.ozuauthor	DEMİROĞLU, Cenk
dc.contributor.ozugradstudent	Khodabakhsh, Ali
dc.contributor.ozugradstudent	Mohammadi, Amir
dc.date.accessioned	2017-02-20T11:27:34Z
dc.date.available	2017-02-20T11:27:34Z
dc.date.issued	2017-03
dc.description.abstract	State-of-the-art speaker verification systems are vulnerable to spoofing attacks using speech synthesis. To address this, high-performance synthetic speech detectors (SSDs) for known attack methods have been proposed recently. Here, as opposed to developing new detectors, we investigate new attack strategies. Studying attack techniques that both spoof the voice verification system and are difficult to detect is expected to improve the security of such systems by enabling the development of better detectors. First, we investigated the vulnerability of an i-vector-based verification system to attacks using statistical speech synthesis (SSS), with a particular focus on the case where the attacker has only a very limited amount of data from the target speaker. Even with a single adaptation utterance, the false alarm rate was found to be 23%. Still, SSS-generated speech is easy to detect (Wu et al., 2015a, 2015b), which dramatically reduces its effectiveness. For more effective attacks with limited data, we propose a hybrid statistical/concatenative synthesis approach and show that hybrid synthesis significantly increases the false alarm rate of the verification system compared to the baseline SSS method. Moreover, the proposed hybrid synthesis makes synthetic speech more difficult to detect than SSS-generated speech, even when only a very limited amount of original speech is available to the attacker. To further increase the effectiveness of the attacks, we propose a linear regression method that transforms synthetic features into more natural features. Even though the regression approach is more effective at spoofing the detectors, it is not as effective as the hybrid synthesis approach at spoofing the verification system. An interpolation approach is proposed to combine the linear regression and hybrid synthesis methods, and it is shown to provide the best spoofing performance in most cases.	en_US
dc.identifier.doi	10.1016/j.csl.2016.08.004	en_US
dc.identifier.endpage	37	en_US
dc.identifier.issn	0885-2308	en_US
dc.identifier.scopus	2-s2.0-84986909494
dc.identifier.startpage	20	en_US
dc.identifier.uri	http://hdl.handle.net/10679/4799
dc.identifier.uri	https://doi.org/10.1016/j.csl.2016.08.004
dc.identifier.volume	42	en_US
dc.identifier.wos	000390501300002
dc.language.iso	eng	en_US
dc.peerreviewed	yes	en_US
dc.publicationstatus	published	en_US
dc.publisher	Elsevier	en_US
dc.relation.ispartof	Computer Speech & Language	en_US
dc.rights	restrictedAccess
dc.subject.keywords	Statistical speech synthesis	en_US
dc.subject.keywords	Hybrid speech synthesis	en_US
dc.subject.keywords	Spoofing verification systems	en_US
dc.subject.keywords	Speaker adaptation	en_US
dc.subject.keywords	Synthetic speech detection	en_US
dc.title	Spoofing voice verification systems with statistical speech synthesis using limited adaptation data	en_US
dc.type	article	en_US
dspace.entity.type	Publication
relation.isOrgUnitOfPublication	7b58c5c4-dccc-40a3-aaf2-9b209113b763
relation.isOrgUnitOfPublication.latestForDiscovery	7b58c5c4-dccc-40a3-aaf2-9b209113b763

Collections

Computer Science
