VIVOLAB-UZ Speaker Diarization System for the Albayzin 2010 Evaluation Campaign

Carlos Vaquero, Alfonso Ortega, Eduardo Lleida

Abstract: This paper describes the speaker diarization systems proposed by the VIVOLAB-UZ group for the Albayzin 2010 speaker diarization evaluation. Our approach combines recent improvements in speaker segmentation of two-speaker telephone conversations, based on eigenvoice modeling, with the traditional Agglomerative Hierarchical Clustering (AHC) approach. We present two submissions. The first system uses a simple eigenvoice factor analysis model to extract, for every recording, a stream of speaker factors that enables better speaker separability. This speaker factor stream is used for speaker segmentation. The clusters obtained are then agglomerated using the Bayesian Information Criterion (BIC) as the distance metric, which yields the speaker labels. The second submission is the same system with a final Viterbi resegmentation step that refines the speaker change points.

Index Terms: Speaker diarization, Factor Analysis, intrasession variability, Agglomerative Hierarchical Clustering, Bayesian Information Criterion.
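
As an illustration of the agglomeration step mentioned in the abstract, the following is a minimal sketch of BIC-driven agglomerative clustering. It is not the system's actual implementation: it assumes each cluster is modeled by a single full-covariance Gaussian over generic feature vectors, and the names `delta_bic`, `agglomerate`, and `penalty_weight` are hypothetical. Under this common formulation, a negative delta-BIC suggests that two clusters are better explained by one model and can be merged.

```python
import numpy as np


def _logdet_cov(x):
    """Log-determinant of the maximum-likelihood covariance of the rows of x."""
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(x, y, penalty_weight=1.0):
    """Delta-BIC between clusters x and y (arrays of shape [frames, dim]).

    Assumption: one full-covariance Gaussian per cluster. Negative values
    favor merging; positive values favor keeping the clusters apart.
    """
    n_x, dim = x.shape
    n_y = y.shape[0]
    n = n_x + n_y
    merged = np.vstack([x, y])
    # Model-complexity penalty: parameters of one mean plus one covariance.
    penalty = 0.5 * (dim + 0.5 * dim * (dim + 1)) * np.log(n)
    return (0.5 * n * _logdet_cov(merged)
            - 0.5 * n_x * _logdet_cov(x)
            - 0.5 * n_y * _logdet_cov(y)
            - penalty_weight * penalty)


def agglomerate(clusters, penalty_weight=1.0):
    """Greedily merge the closest pair until no pair has negative delta-BIC."""
    clusters = [np.asarray(c, dtype=float) for c in clusters]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = delta_bic(clusters[i], clusters[j], penalty_weight)
                if best is None or score < best[0]:
                    best = (score, i, j)
        score, i, j = best
        if score >= 0:  # stopping criterion: no merge improves the BIC
            break
        clusters[i] = np.vstack([clusters[i], clusters[j]])
        del clusters[j]
    return clusters
```

In this sketch, each remaining cluster after the loop would correspond to one speaker label; the penalty weight is a tunable assumption that trades off over- and under-clustering.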
