Browsing by Author "Budnik, M."
Now showing 1 - 2 of 2
LIG at MediaEval 2015 multimodal person discovery in broadcast TV task
Conference Object (metadata only). CEUR-WS, 2015.
Authors: Budnik, M.; Safadi, B.; Besacier, L.; Quénot, G.; Khodabakhsh, Ali; Demiroğlu, Cenk (Electrical & Electronics Engineering)
Abstract: In this working-notes paper, the contribution of the LIG team (a partnership between Univ. Grenoble Alpes and Ozyegin University) to the Multimodal Person Discovery in Broadcast TV task at MediaEval 2015 is presented. The task focused on unsupervised learning techniques. The team submitted two approaches: the first tested new features for the face and speech modalities; the second introduced an alternative way of computing the distance between face tracks and speech segments, which also achieved a competitive MAP score and beat the baseline. (A sketch of one such cross-modal distance appears after this listing.)

OCR-aided person annotation and label propagation for speaker modeling in TV shows
Conference Object (metadata only). IEEE, 2016.
Authors: Budnik, M.; Besacier, L.; Khodabakhsh, Ali; Demiroğlu, Cenk (Electrical & Electronics Engineering)
Abstract: In this paper, we present an approach for minimizing the human effort required for manual speaker annotation. Label propagation is applied at each iteration of an active learning cycle: a selection strategy chooses the most suitable speech track to be labeled (four different strategies are evaluated), and the human annotation is then propagated to all tracks gathered in the corresponding agglomerative cluster. To further reduce manual labor, an optical character recognition (OCR) system is used to bootstrap the annotations. At each step of the cycle, the accumulated annotations are used to build speaker models, whose quality is evaluated with an i-vector based speaker identification system. The approach shows promising results on the REPERE corpus with a minimal amount of human annotation effort. (A sketch of one such propagation round also follows the listing.)
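The second submission in the MediaEval 2015 paper rests on a distance between face tracks and speech segments, but the abstract does not specify the measure. The sketch below is therefore only an illustration, not the team's method: it assumes frame-level face descriptors are mean-pooled into one vector, speech segments are represented by i-vectors already projected into the same space, and the two are compared by cosine distance. The function name track_distance and both representation choices are hypothetical.

```python
import numpy as np

def track_distance(face_track, speech_ivector):
    """Cosine distance between a face track and a speech segment.

    face_track    : (n_frames, d) array of per-frame face descriptors
    speech_ivector: (d,) i-vector for the speech segment
    Assumes both modalities live in a shared d-dimensional space
    (a hypothetical setup; the paper may use a different measure).
    """
    face_vec = face_track.mean(axis=0)                         # pool frames
    face_vec = face_vec / (np.linalg.norm(face_vec) + 1e-12)   # L2-normalise
    speech_vec = speech_ivector / (np.linalg.norm(speech_ivector) + 1e-12)
    return 1.0 - float(face_vec @ speech_vec)                  # in [0, 2]
```

Under such a scheme, face tracks and speech segments whose distance falls below a threshold would be fused into a single person hypothesis.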
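The cycle described in the second paper (cluster, select a track, annotate it, propagate the label) can be made concrete with a small sketch. Everything below is an assumption for illustration: the selection strategy shown (query the largest still-unlabeled cluster, so one annotation covers many tracks) is only what one of the four strategies mentioned in the abstract might look like, and the names annotation_round and ask_human are hypothetical. Clustering uses scikit-learn's AgglomerativeClustering.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def annotation_round(embeddings, labels, ask_human, n_clusters=50):
    """One illustrative active-learning iteration.

    embeddings : (n_tracks, dim) speech-track vectors, e.g. i-vectors
    labels     : list of speaker names, None for unlabeled tracks
    ask_human  : callable taking a track index, returning its true label
    """
    clusters = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(embeddings)

    # Selection strategy (one of several conceivable): pick the largest
    # cluster that contains no labeled track yet.
    best, best_size = None, 0
    for c in range(n_clusters):
        members = np.flatnonzero(clusters == c)
        if any(labels[i] is not None for i in members):
            continue
        if len(members) > best_size:
            best, best_size = c, len(members)
    if best is None:
        return labels  # every cluster already has at least one label

    # Annotate the track nearest the cluster centroid ...
    members = np.flatnonzero(clusters == best)
    centroid = embeddings[members].mean(axis=0)
    query = members[np.argmin(np.linalg.norm(embeddings[members] - centroid, axis=1))]
    name = ask_human(query)

    # ... and propagate that label to every track in the same cluster.
    for i in members:
        labels[i] = name
    return labels
```

In the paper itself, OCR (of, for example, on-screen name overlays) supplies initial labels before any human is queried, and the growing label set is used to train speaker models whose quality is then measured with an i-vector speaker identification system.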