Dataset
Data sources: ZENODO

mshoxxDB - a Versioned Dataset for Electronic Music

Authors: Taenzer, Michael

Abstract

This dataset was presented as a Late Breaking Demo at ISMIR 2024 in San Francisco, CA, including the paper (as an extended abstract), poster, and demo video. It was initially studied in a EURASIP Journal article (https://doi.org/10.1186/s13636-025-00398-2). The dataset is listed in the ISMIR Resources.

Description

mshoxxDB is an open-source dataset for research in Music Information Retrieval (MIR), with a focus on Electronic Music. It was created by Michael Taenzer in the Reason Studios digital audio workstation (DAW). The dataset provides comprehensively annotated music audio data for a genre that has received comparatively limited attention in MIR research. With its combination of diverse synthetic timbres, classical instruments, and multitrack material, it supports tasks such as instrument detection, multi-pitch estimation, source separation, beat detection, and tempo estimation. It is particularly well suited for evaluating instrument-agnostic methods and model generalization. The music covers several sub-genres of Electronic Music, including video game music, 8-bit (chiptune), EDM, pop, house, and chillout/dreamy styles.

For more information, please refer to "README.txt" contained in the archive.

Contents

- 18 full-length pieces of music, 61 minutes of audio in total
- mixtures and multitrack stems in FLAC format (44.1 kHz, 16-bit, mono, compression level 6)
- track-level MIDI files
- CSV metadata including, among others: genres, tempo/bpm, time signature, original composer and artist information
- ms12 and ms14 dataset splits in JSON format, as described in the initial study (see above)

Technical Properties

Not all mixtures are exact sums of their corresponding multitrack stems. Some mixtures may contain additional processing in the form of limiting and compression, e.g. applied to the full mix or through side-chain compression between tracks.
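Because a mixture is not guaranteed to equal the sum of its stems, it can help to measure the deviation before relying on that assumption, for example in source separation evaluation. The following is a minimal sketch in pure Python and is not part of the dataset tooling: `rms` and `residual_db` are hypothetical helper names, and the signals are assumed to be already decoded into equal-length sequences of float samples (decoding the FLAC files is left to an audio library of your choice).

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a mono signal (0.0 for an empty signal)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def residual_db(mixture: Sequence[float], stems: Sequence[Sequence[float]]) -> float:
    """Level of (mixture - sum of stems) relative to the mixture, in dB.

    Returns -inf when the mixture equals the stem sum exactly.
    """
    n = len(mixture)
    if any(len(stem) != n for stem in stems):
        raise ValueError("all stems must have the same length as the mixture")
    # Sum the stems sample by sample; with no stems the sum is silence.
    stem_sum = [sum(column) for column in zip(*stems)] if stems else [0.0] * n
    residual = [m - s for m, s in zip(mixture, stem_sum)]
    mix_rms, res_rms = rms(mixture), rms(residual)
    if res_rms == 0.0:
        return float("-inf")
    if mix_rms == 0.0:
        return float("inf")
    return 20.0 * math.log10(res_rms / mix_rms)


if __name__ == "__main__":
    # Made-up sample values for illustration only, not taken from the dataset.
    stems = [[0.5, -0.5, 0.25, 0.0], [0.25, 0.25, -0.25, 0.5]]
    mixture = [0.75, -0.25, 0.0, 0.5]
    print(residual_db(mixture, stems))
```

A result of -inf indicates an exact sum, while values approaching 0 dB suggest heavier processing on the mix bus, such as limiting. Where to draw the line for "close enough" is left to the user.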
No harmonic effects, such as reverb, echo, or delay, were added to the mixtures, as these would introduce additional harmonic content and could lead to mismatches between MIDI and audio.

Demo Page & Repository

A demo page with selected listening examples is available on GitHub Pages: https://mic-tae.github.io/mshoxxdb/. The mshoxxDB repository is located at https://github.com/mic-tae/mshoxxdb. The canonical archived release of mshoxxDB is hosted here on Zenodo. The GitHub repository and demo page provide supplementary documentation, examples, and project-related resources.

License

This dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0). See "LICENSE.txt" for the full license terms.

Citation

If you use this dataset in your work, please cite it as follows (bibtex) (the first option is preferred):

    @misc{taenzer:mshoxxDB:2024,
      author = {Taenzer, Michael},
      title = {{mshoxxDB - a Versioned Dataset for Electronic Music}},
      booktitle = {{Late-Breaking and Demo Session of the 25th International Conference on Music Information Retrieval (ISMIR)}},
      address = {{San Francisco, CA, USA}},
      year = {2024},
    }

    @dataset{taenzer:mshoxxDB:2024,
      author = {Taenzer, Michael},
      title = {mshoxxDB - a Versioned Dataset for Electronic Music},
      year = {2024},
      publisher = {Zenodo},
      doi = {10.5281/zenodo.15881577},
      url = {https://doi.org/10.5281/zenodo.15881577},
    }

For methodological details and the initial study of the dataset, please also refer to the accompanying journal article and ISMIR 2024 late-breaking demo contribution.

Future versions

Future versions of mshoxxDB may include additional music, segmentation annotations for each piece, automation information of synthesizer parameters in the MIDIs, and possibly stereo audio data.

Community

Contributions to this dataset are welcome, for example through additional music, annotations, metadata improvements, or other suggestions that could help improve mshoxxDB.
Changelog

Version 1.2 (15 April 2026)
- added "metadata.csv" back into the archive
- substantially restructured "metadata.csv":
  - new columns: "piece_id", "artist", "bpm_min", "bpm_max"
  - renamed columns: "genre" -> "genres", "length" -> "duration_seconds", "timesig" -> "time_signature"
  - dropped column: "tempo"
- json files now use "piece_id" instead of filenames as identifier
- changed all remaining umlauts "ü" -> "ue"
- substantially extended "README.txt"
- small changes to "LICENSE.txt"

Version 1.1 (16 July 2025)
- renamed all files to reflect main DB version number v1
- changed umlaut "ü" from "Güte" -> "Guete"
- added dataset splits "ms12" (1 json) and "ms14" (3 jsons) as described and used in https://doi.org/10.1186/s13636-025-00398-2
- added LICENSE file
- added README file

Version 1 (14 November 2024)
- Initial release
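Code written against an earlier "metadata.csv" may still expect the column names that Version 1.2 renamed or dropped. The sketch below shows one way to apply the renames and the removal listed in the Version 1.2 entry using only the Python standard library. The helper names `upgrade_header` and `upgrade_csv` are hypothetical, the sample row is made up, and the new columns ("piece_id", "artist", "bpm_min", "bpm_max") are not generated, since their values cannot be derived from the changelog alone.

```python
import csv
import io

# Renames and removals as listed in the Version 1.2 changelog entry.
RENAMED = {"genre": "genres", "length": "duration_seconds", "timesig": "time_signature"}
DROPPED = {"tempo"}


def upgrade_header(old_columns):
    """Map pre-1.2 column names to their 1.2 names, omitting dropped columns."""
    return [RENAMED.get(c, c) for c in old_columns if c not in DROPPED]


def upgrade_csv(old_text: str) -> str:
    """Rewrite CSV text with the 1.2 column names; cell values are left untouched."""
    reader = csv.DictReader(io.StringIO(old_text))
    new_columns = upgrade_header(reader.fieldnames or [])
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=new_columns, lineterminator="\n", extrasaction="ignore"
    )
    writer.writeheader()
    for row in reader:
        # Rename keys; dropped or unknown extra keys are ignored by the writer.
        writer.writerow({RENAMED.get(k, k): v for k, v in row.items()})
    return out.getvalue()


if __name__ == "__main__":
    # Made-up row for illustration only, not taken from the dataset.
    sample = "genre,length,timesig,tempo\nhouse,200,4/4,128\n"
    print(upgrade_csv(sample))
```

Columns not mentioned in the changelog pass through unchanged, so the helper can be applied to a file whose full header is not known in advance.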
