Automatic music transcription aims to extract a musical score from a given audio signal. Conventional machine learning frameworks usually address this task by relying solely on error back-propagation from annotated MIDI data, without taking acoustic similarity into account. In this study, we complement the onsets-and-frames prediction objective with an acoustic distance, obtained through differentiable rendering of the estimated piano-roll and approximate reconstruction of the analyzed signal. We apply our method to piano transcription and show that this added reconstruction error improves on the performance achieved with the usual supervised transcription loss alone. Moreover, using this acoustic criterion by itself allows fully unsupervised training, with results that outperform classical techniques. Finally, our method also enables automatic instrument transposition: when reconstructing the input signal, we use audio samples of an instrument different from the original sound source.
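The core idea of the acoustic distance can be sketched as follows: render an approximate waveform from the estimated piano-roll by summing per-pitch note templates weighted by their frame activations, then compare the rendering to the input signal in the spectral domain. This is a minimal illustrative sketch, not the authors' implementation; the function names (`render`, `acoustic_distance`), shapes, and the simple magnitude-spectrum loss are all assumptions, and a real system would use an autodiff framework so the loss is differentiable with respect to the piano-roll.

```python
import numpy as np

def render(piano_roll, note_waveforms, hop=64):
    """Approximate audio from a piano-roll: for each frame, add the
    per-pitch note waveforms weighted by their activations.
    piano_roll: (n_pitches, n_frames), note_waveforms: (n_pitches, length).
    (Hypothetical sketch of the differentiable rendering step.)"""
    n_pitches, n_frames = piano_roll.shape
    length = note_waveforms.shape[1]
    audio = np.zeros(n_frames * hop + length)
    for f in range(n_frames):
        start = f * hop
        # weighted sum of the note templates active in this frame
        audio[start:start + length] += note_waveforms.T @ piano_roll[:, f]
    return audio

def acoustic_distance(x, y, n_fft=256):
    """L2 distance between magnitude spectra of the first n_fft samples;
    a stand-in for a full spectrogram-based reconstruction loss."""
    n = min(len(x), len(y), n_fft)
    X = np.abs(np.fft.rfft(x[:n]))
    Y = np.abs(np.fft.rfft(y[:n]))
    return float(np.mean((X - Y) ** 2))

# Toy example: 3 pitches with sinusoidal note templates, 5 frames.
rng = np.random.default_rng(0)
sr = 4000
t = np.arange(512) / sr
notes = np.stack([np.sin(2 * np.pi * f0 * t) for f0 in (220.0, 277.2, 329.6)])
roll = (rng.random((3, 5)) > 0.5).astype(float)

target = render(roll, notes)
noisy_roll = np.clip(roll + 0.1 * rng.standard_normal(roll.shape), 0.0, 1.0)
loss = acoustic_distance(render(noisy_roll, notes), target)
```

In a training setting, this distance would be minimized (alongside, or instead of, the supervised transcription loss) by gradient descent through the renderer; swapping in note samples from a different instrument in `render` corresponds to the transposition use case described above.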
