OCR-aided person annotation and label propagation for speaker modeling in TV shows

Budnik, M.; Besacier, L.; Khodabakhsh, Ali; Demiroğlu, Cenk

dc.contributor.author	Budnik, M.
dc.contributor.author	Besacier, L.
dc.contributor.author	Khodabakhsh, Ali
dc.contributor.author	Demiroğlu, Cenk
dc.contributor.department	Electrical & Electronics Engineering
dc.contributor.ozuauthor	DEMİROĞLU, Cenk
dc.contributor.ozugradstudent	Khodabakhsh, Ali
dc.date.accessioned	2016-07-29T05:25:57Z
dc.date.available	2016-07-29T05:25:57Z
dc.date.issued	2016
dc.description	Due to copyright restrictions, access to the full text of this article is available only via subscription.
dc.description.abstract	In this paper, we present an approach for minimizing human effort in manual speaker annotation. Label propagation is used at each iteration of an active learning cycle. More precisely, a selection strategy for choosing the most suitable speech track to be labeled is proposed. Four different selection strategies are evaluated and all the tracks in a corresponding cluster are gathered using agglomerative clustering in order to propagate human annotations. To further reduce the manual labor required, an optical character recognition system is used to bootstrap annotations. At each step of the cycle, annotations are used to build speaker models. The quality of the generated speaker models is evaluated at each step using an i-vector based speaker identification system. The presented approach shows promising results on the REPERE corpus with a minimum amount of human effort for annotation.
dc.identifier.doi	10.1109/ICASSP.2016.7472743
dc.identifier.endpage	5574
dc.identifier.issn	1520-6149
dc.identifier.scopus	2-s2.0-84973301088
dc.identifier.startpage	5570
dc.identifier.uri	http://hdl.handle.net/10679/4330
dc.identifier.uri	https://doi.org/10.1109/ICASSP.2016.7472743
dc.identifier.wos	000388373405144
dc.language.iso	eng	en_US
dc.peerreviewed	yes
dc.publicationstatus	published	en_US
dc.publisher	IEEE
dc.relation.ispartof	2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
dc.relation.publicationcategory	International
dc.rights	restrictedAccess
dc.subject.keywords	Active learning
dc.subject.keywords	Annotation propagation
dc.subject.keywords	Clustering
dc.subject.keywords	Speaker identification
dc.subject.keywords	OCR
dc.title	OCR-aided person annotation and label propagation for speaker modeling in TV shows	en_US
dc.type	conferenceObject	en_US
dspace.entity.type	Publication
relation.isOrgUnitOfPublication	7b58c5c4-dccc-40a3-aaf2-9b209113b763
relation.isOrgUnitOfPublication.latestForDiscovery	7b58c5c4-dccc-40a3-aaf2-9b209113b763

Collections

Computer Science
