
doi: 10.1121/10.0019176
Sound event recognition is the task of identifying and categorizing sounds in audio data. Automated algorithms for sound event recognition depend on explicit models for the individual sound event types to be recognized, which are trained on data tagged explicitly for those classes. This approach is data-hungry and is fundamentally limited by the number of classes for which such data may be obtained. It also ignores the relationships between the sounds being modeled. In this work, we attempt to address these deficiencies through the use of a human-generated sound ontology which represents sibling and parent–child relations between sound classes. We incorporate these relationships in two ways: through the design of an appropriate “loss” function (the objective function optimized to train sound-classifier models), and through model update rules which use data from a class to update the parameters of both its ontological siblings and its parents. Through experiments run on “Audioset” (a popular, large-scale dataset of 600 sound categories), we find that better-performing models can be trained for sound classes with a given dataset, and that the amount of new data required to train models for a novel sound class can be significantly reduced.
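The abstract does not give the form of the loss, so the following is only a minimal sketch of one common way parent–child relations can enter a training objective: a clip tagged with a class is also treated as a positive example for every ancestor of that class, and a binary cross-entropy is computed over the expanded targets. The toy ontology `PARENT`, the class names, and the functions `ancestors`, `expand_labels`, and `ontology_bce` are all hypothetical illustrations and should not be read as the authors' method.

```python
import math

# Hypothetical toy ontology (child -> parent); not taken from Audioset.
PARENT = {
    "dog_bark": "animal",
    "cat_meow": "animal",
    "animal": None,
    "siren": "alarm",
    "alarm": None,
}


def ancestors(cls, parent=PARENT):
    """Return all ancestors of `cls`, nearest first."""
    out = []
    node = parent[cls]
    while node is not None:
        out.append(node)
        node = parent[node]
    return out


def expand_labels(labels, parent=PARENT):
    """Add every ancestor of each tagged class to the label set."""
    expanded = set(labels)
    for cls in labels:
        expanded.update(ancestors(cls, parent))
    return expanded


def ontology_bce(probs, labels, parent=PARENT, eps=1e-7):
    """Mean binary cross-entropy over all classes, with ancestor-expanded targets.

    Assumption: each class has an independent predicted probability in `probs`.
    """
    targets = expand_labels(labels, parent)
    total = 0.0
    for cls in parent:
        p = min(max(probs[cls], eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= math.log(p) if cls in targets else math.log(1.0 - p)
    return total / len(parent)
```

Under this sketch, a clip tagged only `dog_bark` also penalizes a low predicted probability for `animal`, so data for a child class contributes to the parent's model as well.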
