Data-driven Pitch Content Description of Choral Singing Recordings

Publication: Thesis, Other literature type, Doctoral thesis; 01 Jan 2022; Spain; English; Publisher: Zenodo; Funded by: EC | TROMPA

Authors: Helena Cuesta

doi: 10.5281/zenodo.6389642, 10.5281/zenodo.6389643

handle: 10803/673924

Abstract

Singing in a vocal ensemble is an activity rooted in many cultures that takes place in a variety of formats, languages, and levels. However, the lack of suitable data has meant that it has not been studied extensively in the field of Music Information Retrieval (MIR). In this thesis, we first address the data scarcity by creating new open datasets of multi-track vocal ensemble recordings. We then focus mainly on three research tasks: multiple F0 estimation and streaming, voice assignment, and unison modelling, all in the context of four-part vocal groups. Thus, the first contribution of this thesis is the publication of four datasets of vocal ensemble recordings: Choral Singing Dataset, Dagstuhl ChoirSet, ESMUC Choir Dataset, and Cantoría Dataset, all with multi-track audio recordings and annotations. The second contribution of this thesis is a set of deep learning models for multiple F0 estimation and streaming and for voice assignment in vocal quartets, mainly based on convolutional neural networks designed to incorporate musical knowledge. Finally, we propose two methods to model and characterize vocal unisons in terms of pitch dispersion.

Ensemble singing is a well-established practice across cultures, found in a great diversity of forms, languages, and levels. However, it has not been widely studied in the field of Music Information Retrieval (MIR), likely due to the lack of appropriate data. In this dissertation, we first address the data scarcity by building new open, multi-track datasets of ensemble singing. Then, we address three main research problems: multiple F0 estimation and streaming, voice assignment, and the characterization of vocal unisons, all in the context of four-part vocal ensembles. Hence, the first contribution of this thesis is the development and release of four multi-track datasets of vocal ensembles: Choral Singing Dataset, Dagstuhl ChoirSet, ESMUC Choir Dataset, and Cantoría Dataset, all of them with audio recordings and accompanying annotations. The second contribution is a set of deep learning models for multiple F0 estimation, streaming, and voice assignment of vocal quartets, mainly based on convolutional neural networks designed to leverage music domain knowledge. Finally, we propose two methods to characterize vocal unison performances in terms of pitch dispersion.

Doctoral Programme in Information and Communication Technologies

Country

Spain

Keywords

62, Music information retrieval (MIR), Singing, Choral singing, Vocal music, Open data, Multi-pitch estimation, Voice assignment, Automatic music transcription, Unison

Impact by BIP!

selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.