
MTGAN: Speaker Verification through Multitasking Triplet Generative Adversarial Networks

Ding, Wenhao; He, Liang
Open Access
  • Published: 02 Sep 2018
  • Publisher: ISCA
Abstract
Comment: submitted to Interspeech 2018
Subjects
free text keywords: Adversarial system, Speaker verification, Generative grammar, Human multitasking, Speech recognition, Computer science, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing