Automatic Music Transcription: Breaking the Glass Ceiling.

Conference object, Unknown OPEN
Emmanouil Benetos ; Simon Dixon ; Dimitrios Giannoulis ; Holger Kirchhoff ; Anssi Klapuri (2012)

Automatic music transcription is considered by many to be the Holy Grail in the field of music signal analysis. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general-purpose models which are unable to capture the rich diversity found in music signals. In order to overcome the limited performance of transcription systems, algorithms have to be tailored to specific use cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available is a rich potential source of training data, via forced alignment of audio to scores, but large-scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information across different methods and musical aspects.