Research on style transfer and domain translation has clearly demonstrated the ability of deep-learning-based algorithms to manipulate images in terms of artistic style. More recently, several attempts have been made to extend such approaches to music (both symbolic and audio) so as to transform musical style in a similar manner. In this study, we focus on symbolic music with the goal of altering the 'style' of a piece while keeping its original 'content'. Unlike current methods, which are inherently restricted to being unsupervised due to the lack of 'aligned' data (i.e. the same musical piece played in multiple styles), we develop the first fully supervised algorithm for this task. At the core of our approach lies a synthetic data generation scheme that allows us to produce virtually unlimited amounts of aligned data and hence sidestep this issue. Building on this scheme, we propose an encoder-decoder model for translating symbolic music accompaniments between a number of different styles. Our experiments show that our models, although trained entirely on synthetic data, are capable of producing musically meaningful accompaniments even for real (non-synthetic) MIDI recordings.
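The key enabler described in the abstract is that synthetic rendering makes supervision possible: the same underlying chord progression can be realized in several accompaniment styles, yielding exactly aligned source/target pairs. The sketch below illustrates that idea only; the style rules, note encoding, and function names are illustrative assumptions, not the paper's actual data pipeline.

```python
import random

# Each style renders one chord (a list of MIDI pitches) into a bar of
# (onset_step, pitch, duration_steps) note events. Two toy styles:
STYLES = {
    "block":    lambda chord: [(0, p, 4) for p in chord],                 # held chord
    "arpeggio": lambda chord: [(i, p, 1) for i, p in enumerate(chord)],   # broken chord
}

def render(progression, style):
    """Render a chord progression bar by bar in the given style."""
    return [STYLES[style](chord) for chord in progression]

def make_aligned_pair(src_style, tgt_style, rng):
    """Sample a random 4-bar triad progression and render it in both
    styles, producing an aligned (source, target) training pair."""
    roots = [rng.choice([60, 62, 64, 65, 67, 69]) for _ in range(4)]
    progression = [[r, r + 4, r + 7] for r in roots]  # simple major triads
    return render(progression, src_style), render(progression, tgt_style)

rng = random.Random(0)
src, tgt = make_aligned_pair("block", "arpeggio", rng)
```

Because both sides are rendered from the identical progression, every source bar has a target bar with the same pitch content, which is what allows a supervised encoder-decoder translator to be trained directly on such pairs.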
ISMIR 2019 camera-ready
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], FOS: Computer and information sciences, Computer Science - Machine Learning, Sound (cs.SD), [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing, Machine Learning (stat.ML), [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], [INFO.INFO-SD] Computer Science [cs]/Sound [cs.SD], Computer Science - Sound, Machine Learning (cs.LG), [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL], Statistics - Machine Learning, Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Electrical Engineering and Systems Science - Audio and Speech Processing
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | Citations derived from selected sources; an alternative to the "influence" indicator, which reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 61 |
| downloads | | 25 |

Views and downloads provided by UsageCounts.