A tensor-based approach for automatic music genre classification

Conference object, Unknown OPEN
Benetos, Emmanouil ; Kotropoulos, Costas (2008)
  • Related identifiers: doi: 10.5281/zenodo.41063
  • Subject: M1 | QA75
    arxiv: Computer Science::Sound
    acm: ComputingMethodologies_PATTERNRECOGNITION

Most music genre classification techniques employ pattern recognition algorithms to classify feature vectors extracted from recordings into genres. An automatic music genre classification system using tensor representations is proposed, in which each recording is represented by a feature matrix over time. A feature tensor is then created by concatenating the feature matrices associated with the recordings. A novel algorithm for non-negative tensor factorization (NTF) is developed, which minimizes the Frobenius norm between an n-dimensional raw feature tensor and its decomposition into a sum of elementary rank-1 tensors. Moreover, a supervised NTF classifier is proposed. A variety of sound description features are extracted from recordings in the GTZAN dataset, which covers 10 genre classes. NTF classifier performance is compared against multilayer perceptrons, support vector machines, and non-negative matrix factorization classifiers. On average, a genre classification accuracy of 75% with a standard deviation of 1% is achieved. It is demonstrated that NTF classifiers outperform matrix-based ones.
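
To make the factorization idea in the abstract concrete, the following is a minimal sketch of a non-negative decomposition of a 3-way tensor into a sum of rank-1 tensors under the Frobenius norm. It uses the standard multiplicative update rule familiar from non-negative matrix factorization, applied to each mode in turn. It is not the novel algorithm or the supervised classifier proposed in the paper. The function names (`khatri_rao`, `unfold`, `ntf_frobenius`, `reconstruct`), the restriction to three modes, and the tensor layout (recordings x features x time frames) are assumptions made for illustration.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)


def unfold(x, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the rest."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)


def reconstruct(factors):
    """Sum of rank-1 tensors built from three factor matrices."""
    a, b, c = factors
    return np.einsum("ir,jr,kr->ijk", a, b, c)


def ntf_frobenius(x, rank, n_iter=200, eps=1e-9, seed=0):
    """Non-negative rank-`rank` factorization of a 3-way tensor `x`.

    Illustrative multiplicative updates that reduce the Frobenius norm
    ||x - reconstruct(factors)||; an assumption, not the paper's method.
    """
    if x.ndim != 3:
        raise ValueError("this sketch handles 3-way tensors only")
    rng = np.random.default_rng(seed)
    # Random positive start keeps every later update non-negative.
    factors = [rng.random((dim, rank)) + eps for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            numer = unfold(x, mode) @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps
            factors[mode] *= numer / denom
    return factors


def relative_error(x, factors):
    """Frobenius-norm reconstruction error relative to ||x||."""
    return np.linalg.norm(x - reconstruct(factors)) / np.linalg.norm(x)
```

Calling `ntf_frobenius(x, rank=3)` on a non-negative array `x` of shape (recordings, features, frames) returns three non-negative factor matrices, one per mode, and `relative_error(x, factors)` reports how closely their rank-1 sum approximates `x`. How a classifier would use the resulting factors is left open here, since the abstract does not specify it.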