
This article presents a robust multi-classifier fusion approach for improving the performance of target Speaker Detection (SD) systems. A single classifier can cause significant performance degradation. To overcome this problem, we propose fusing three classifiers, Hierarchical Ascending Clustering (HAC), Gaussian Mixture Model (GMM), and Support Vector Machine (SVM), within an architecture based on Voice Activity Detection (VAD) in order to reduce speaker detection errors. We conducted a comparative study of the individual classifiers and their fusion, evaluating all of them on telephone conversations extracted from the NIST-2005 corpus. The experiments show that multi-classifier fusion on this architecture considerably improves the performance of the target SD system compared with each classifier applied alone: the fusion approach reaches a Speaker Detection Rate (SDR) of 99.18%, compared with 85.98% for HAC, 86.68% for GMM, and 97.67% for SVM.
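To make the idea of fusing classifier outputs concrete, the following is a minimal sketch of decision-level fusion by majority vote. The abstract does not state which fusion rule is used, so majority voting is an assumption chosen for illustration. The function names, the toy segment labels, and the reading of SDR as the percentage of segments attributed to the correct speaker are likewise hypothetical and not taken from the article.

```python
from collections import Counter

def majority_vote(decisions):
    """Return the label chosen by the most classifiers.

    Assumption: with equal counts, the label seen first wins
    (Counter.most_common keeps first-encountered order on ties).
    """
    return Counter(decisions).most_common(1)[0][0]

def fuse(hac, gmm, svm):
    """Fuse per-segment decisions of three classifiers by majority vote."""
    return [majority_vote(triple) for triple in zip(hac, gmm, svm)]

def detection_rate(predicted, reference):
    """Percentage of segments whose predicted speaker matches the reference.

    Assumption: this stands in for the SDR metric, which the abstract
    does not define.
    """
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)

# Toy data (invented): speaker labels for five speech segments kept by a VAD.
reference = ["A", "B", "A", "B", "A"]
hac = ["A", "B", "B", "B", "A"]  # one error, on segment 3
gmm = ["A", "A", "A", "B", "A"]  # one error, on segment 2
svm = ["A", "B", "A", "A", "A"]  # one error, on segment 4

fused = fuse(hac, gmm, svm)

for name, pred in [("HAC", hac), ("GMM", gmm), ("SVM", svm), ("Fusion", fused)]:
    print(f"{name}: {detection_rate(pred, reference):.2f}%")
```

In this toy case each classifier errs on a different segment, so the vote corrects every error. That is the general motivation for fusion: it helps most when the individual classifiers make uncorrelated mistakes.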
