Speaker Clustering Using Dominant Sets

Publication: Article, Conference object, Preprint
Date: 01 Aug 2018
Embargo end date: 25 May 2018
Publisher: IEEE
Journal: 2018 24th International Conference on Pattern Recognition (ICPR)

Authors: Feliks Hibraj; Sebastiano Vascon; Thilo Stadelmann; Marcello Pelillo

doi: 10.1109/icpr.2018.8546067, 10.21256/zhaw-4254, 10.48550/arxiv.1805.08641

arXiv: 1805.08641

Abstract

Speaker clustering is the task of forming speaker-specific groups from a set of utterances. In this paper, we address this task using Dominant Sets (DS). DS is a graph-based clustering algorithm whose properties fit our problem well, and it has not previously been applied to speaker clustering. We report a comprehensive set of experiments on the TIMIT dataset against standard clustering techniques and dedicated speaker clustering methods. Moreover, we compare performance under different features: ones learned by a deep neural network directly on TIMIT, and ones extracted from a pre-trained VGGVox network. To assess stability, we perform a sensitivity analysis on the free parameters of our method, showing that performance is stable under parameter changes. The extensive experiments confirm the validity of the proposed method, which achieves state-of-the-art results under three different standard metrics. We also report, for the first time, reference baseline results for speaker clustering on the entire TIMIT dataset.
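
As a rough illustration of the graph-based clustering idea mentioned in the abstract, the following is a minimal sketch of dominant-set extraction by replicator dynamics with a peel-off loop, which is the standard way dominant sets are computed. It is not the authors' SCDS implementation: the function names, the `cutoff` and `tol` parameters, and the choice to turn mutually dissimilar leftovers into singleton clusters are assumptions made for this example. The abstract also does not state how the similarity matrix is built from the speaker features, so the sketch simply takes a non-negative, symmetric similarity matrix as input.

```python
import numpy as np


def replicator_dynamics(A, max_iter=1000, tol=1e-8):
    """Iterate x <- x * (A x) / (x^T A x), starting from the simplex barycenter."""
    n = A.shape[0]
    x = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        Ax = A @ x
        total = x @ Ax
        if total <= 0:  # no similarity mass left to redistribute
            break
        x_new = x * Ax / total
        converged = np.abs(x_new - x).sum() < tol
        x = x_new
        if converged:
            break
    return x


def dominant_sets(A, cutoff=1e-4):
    """Peel off dominant sets one at a time; returns one integer label per node.

    A      : non-negative, symmetric similarity matrix (assumed input format).
    cutoff : hypothetical relative threshold that defines the support of x.
    """
    A = np.array(A, dtype=float)
    np.fill_diagonal(A, 0.0)  # dominant sets are defined on graphs without self-loops
    labels = np.full(A.shape[0], -1, dtype=int)
    remaining = np.arange(A.shape[0])
    cluster = 0
    while remaining.size > 0:
        sub = A[np.ix_(remaining, remaining)]
        if sub.sum() == 0:
            # Leftover nodes share no similarity: give each its own cluster (assumption).
            labels[remaining] = cluster + np.arange(remaining.size)
            break
        x = replicator_dynamics(sub)
        members = x > cutoff * x.max()  # nodes in the support of the converged vector
        labels[remaining[members]] = cluster
        remaining = remaining[~members]
        cluster += 1
    return labels
```

Calling `dominant_sets(S)` on a similarity matrix `S` with two tight blocks and weak cross-block similarity should place each block in its own cluster. Note that the number of clusters is not an input here: it falls out of the peel-off loop, which is one of the properties that makes the approach attractive when the number of speakers is unknown.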

Related Organizations

Winterthur Museum Garden and Library, United States
Ca' Foscari University of Venice, Italy

Keywords

FOS: Computer and information sciences; Speaker recognition; Sound (cs.SD); Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; 006: Special computer methods; Speaker embeddings; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing

Related research products

SCDS software on GitHub (IsRelatedTo)
VGGVox software on GitHub (IsRelatedTo)

Impact by BIP!

selected citations: 2
popularity: Average
influence: Average
impulse: Average