ZENODO
Article . 2009
License: CC BY
Data sources: Datacite
Data sources: ZENODO

Improved Text-Independent Speaker Identification Using Fused MFCC and IMFCC Feature Sets Based on Gaussian Filter

Authors: Sandipan Chakroborty; Goutam Saha

Abstract

A state-of-the-art Speaker Identification (SI) system requires a robust feature extraction unit followed by a speaker modeling scheme for generalized representation of these features. Over the years, Mel-Frequency Cepstral Coefficients (MFCC), modeled on the human auditory system, have been used as a standard acoustic feature set for speech-related applications. In a recent contribution by the authors, it was shown that Inverted Mel-Frequency Cepstral Coefficients (IMFCC) are a useful feature set for SI, capturing complementary information present in the high-frequency region. This paper introduces Gaussian-shaped filters (GF) in place of the typical triangular-shaped bins when calculating MFCC and IMFCC. The objective is to introduce a higher amount of correlation between subband outputs. The performance of both MFCC and IMFCC improves with GF over the conventional triangular filter (TF) based implementation, individually as well as in combination. With GMM as the speaker modeling paradigm, the performance of the proposed GF-based MFCC and IMFCC in individual and fused modes has been verified on two standard databases, YOHO (microphone speech) and POLYCOST (telephone speech), each of which has more than 130 speakers.
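The difference between the two filter shapes can be illustrated with a minimal Python sketch. It is not the authors' implementation. The mel formula constants, the rule for the Gaussian width (`sigma = (upper boundary - center) / alpha` with `alpha = 2.0`), the construction of the inverted bank by flipping the mel bank along both axes, and all function names are assumptions made for illustration only.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel mapping; the exact constants are an assumption here.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_boundaries(n_filters, sample_rate):
    # n_filters + 2 boundary frequencies (Hz), equally spaced on the mel scale.
    mels = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    return mel_to_hz(mels)

def fft_bin_freqs(n_fft, sample_rate):
    # Frequencies (Hz) of the non-negative FFT bins.
    return np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)

def triangular_bank(n_filters, n_fft, sample_rate):
    # Conventional triangular filters (TF): each one is zero outside
    # its two neighbouring boundaries.
    freqs = fft_bin_freqs(n_fft, sample_rate)
    b = mel_boundaries(n_filters, sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = b[i], b[i + 1], b[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def gaussian_bank(n_filters, n_fft, sample_rate, alpha=2.0):
    # Gaussian-shaped filters (GF) on the same centers. The tails never
    # reach zero, so every filter overlaps all others to some degree.
    # The width rule below is a hypothetical choice.
    freqs = fft_bin_freqs(n_fft, sample_rate)
    b = mel_boundaries(n_filters, sample_rate)
    centers = b[1:-1]
    sigmas = (b[2:] - b[1:-1]) / alpha
    z = (freqs[None, :] - centers[:, None]) / sigmas[:, None]
    return np.exp(-0.5 * z ** 2)

def inverted_bank(bank):
    # Flip filter order and frequency axis so that the narrow, dense
    # filters sit in the high-frequency region (assumed IMFCC layout).
    return bank[::-1, ::-1]

def overlap(f1, f2):
    # Cosine similarity between two filter responses.
    return float(f1 @ f2 / (np.linalg.norm(f1) * np.linalg.norm(f2)))

tf = triangular_bank(n_filters=20, n_fft=512, sample_rate=8000)
gf = gaussian_bank(n_filters=20, n_fft=512, sample_rate=8000)
igf = inverted_bank(gf)
```

In this sketch, triangular filters two positions apart share no frequency bins, while the corresponding Gaussian filters still overlap. That overlap is one way to picture how smoother filter shapes can raise the correlation between subband outputs.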

Keywords

Triangular Filter, GMM, MFCC, Gaussian Filter, Subbands, Correlation, IMFCC
