Impact of Time and Note Duration Tokenizations on Deep Learning Symbolic Music Modeling

Publication: Article, Conference object, Preprint. Date: 01 Jan 2023. Embargo end date: 01 Jan 2023. Country: France. Publisher: ISMIR. Journal: CoRR, volume abs/2310.08497

Authors: Fradet, Nathan; Gutowski, Nicolas; Chhel, Fabien; Briot, Jean-Pierre

doi: 10.5281/zenodo.10265229, 10.48550/arxiv.2310.08497, 10.5281/zenodo.10265228

arXiv: 2310.08497

Abstract

Symbolic music is widely used in various deep learning tasks, including generation, transcription, synthesis, and Music Information Retrieval (MIR). It is mostly employed with discrete models such as Transformers, which require music to be tokenized, i.e., formatted into sequences of distinct elements called tokens. Tokenization can be performed in different ways. Because Transformers can struggle at reasoning but capture explicit information more easily, it is important to study how the way information is represented for such models impacts their performance. In this work, we analyze the common tokenization methods and experiment with time and note duration representations. We compare the performance of these two impactful criteria on several tasks, including composer and emotion classification, music generation, and sequence representation learning. We demonstrate that explicit information leads to better results depending on the task.
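To make the two criteria concrete, here is a minimal Python sketch of how the same notes can be tokenized with different time and note duration representations. It is an illustration only, not the authors' implementation and not any library's API. The function names, the token names (`TimeShift`, `Bar`, `Position`, `Duration`, `NoteOn`, `NoteOff`, `Pitch`), the tick resolution, and the `ticks_per_bar` value are all assumptions chosen for the example.

```python
from typing import List, Tuple

# A note is (onset_tick, duration_ticks, pitch). Hypothetical toy format.
Note = Tuple[int, int, int]


def tokenize_timeshift_duration(notes: List[Note]) -> List[str]:
    """Time as relative TimeShift tokens, note length as explicit Duration tokens."""
    tokens, current = [], 0
    for onset, dur, pitch in sorted(notes):
        if onset > current:
            tokens.append(f"TimeShift_{onset - current}")
            current = onset
        tokens += [f"Pitch_{pitch}", f"Duration_{dur}"]
    return tokens


def tokenize_timeshift_noteoff(notes: List[Note]) -> List[str]:
    """Time as TimeShift tokens, note length implied by NoteOn/NoteOff pairs."""
    events = []
    for onset, dur, pitch in notes:
        events.append((onset, 1, pitch))        # 1 = NoteOn
        events.append((onset + dur, 0, pitch))  # 0 = NoteOff, sorts first on ties
    tokens, current = [], 0
    for tick, is_on, pitch in sorted(events):
        if tick > current:
            tokens.append(f"TimeShift_{tick - current}")
            current = tick
        tokens.append(f"NoteOn_{pitch}" if is_on else f"NoteOff_{pitch}")
    return tokens


def tokenize_bar_position_duration(notes: List[Note], ticks_per_bar: int = 16) -> List[str]:
    """Time as explicit Bar and Position tokens, note length as Duration tokens."""
    tokens, current_bar, current_pos = [], -1, None
    for onset, dur, pitch in sorted(notes):
        bar, pos = divmod(onset, ticks_per_bar)
        while current_bar < bar:  # emit one Bar token per bar crossed
            tokens.append("Bar")
            current_bar += 1
            current_pos = None
        if pos != current_pos:
            tokens.append(f"Position_{pos}")
            current_pos = pos
        tokens += [f"Pitch_{pitch}", f"Duration_{dur}"]
    return tokens


# Three notes: two in the first bar, one at the start of the second bar.
notes = [(0, 4, 60), (4, 4, 64), (16, 8, 67)]
for fn in (tokenize_timeshift_duration,
           tokenize_timeshift_noteoff,
           tokenize_bar_position_duration):
    print(fn.__name__, fn(notes))
```

In this toy setting, the `Duration` variants state each note's length in a single token, whereas the `NoteOff` variant requires summing the intervening `TimeShift` values to recover it. Likewise, `Bar` and `Position` tokens state the absolute location in the bar, which `TimeShift` tokens only give implicitly.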

ISMIR 2023

Country

France

Related Organizations

Sorbonne Université
France
French National Centre for Scientific Research
France
Sorbonne University
France
École Supérieure d'Electronique de l'Ouest
France
UNIVERSITE D'ANGERS
France

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Sound (cs.SD), Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, [INFO] Computer Science [cs], Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing, Machine Learning (cs.LG)

Impact by BIP!

selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
