ZENODO
Dataset . 2025
License: CC BY
Data sources: Datacite
Neural Music Fingerprinting Dataset

Authors: Araz, Recep Oğuz;


Abstract

# Neural Music Fingerprinting Dataset

A realistic dataset for evaluating music fingerprints under various audio degradations. This data was used for the experiments in our ISMIR 2025 paper "Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification".

We share the following data on Zenodo:

* **Music**
  * Training audio chunks (10,000 × 30 sec)
  * Test queries (10,000 × 30 sec) (clean-time_shifted-degraded)
    * The time boundaries of the chunks inside the full tracks. You can use these to obtain the aligned, clean versions from the full tracks in the database. We could not share the intermediate steps (clean, clean-time_shifted, clean-degraded) due to Zenodo's 50 GB cap.
  * Full tracks of the queries (10,000 full tracks)
* **Degradation audio**

The entire test database takes 400+ GB of space, which cannot be shared on Zenodo in a single repository. Therefore, you should download the FMA dataset and process it by following the steps in `dataset_creation/README` to obtain the entire database. Be sure to use the 10,000 database tracks that we include with this dataset. We include the full tracks of the query chunks so that the clean versions are exactly the same (during mp3-to-wav conversion and processing, SoX may apply dithering, which is a stochastic process; we are not sure about its effect, but ideally, the master tracks should be identical).

All audio files have an 8,000 Hz sampling rate and are in `.wav` format encoded with 16-bit LPCM.

To decompress the tarball: `tar -xJf neural-music-fp-dataset.tar.xz`

Please cite the following publication when using the code, data, or the models:

> R. O. Araz, G. Cortès-Sebastià, E. Molina, J. Serrà, X. Serra, Y. Mitsufuji, and D. Bogdanov, "Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification," in Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR), 2025.
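Because the whole dataset is fixed at 8,000 Hz, 16-bit LPCM `.wav`, a quick sanity check with Python's standard `wave` module can catch accidental resampling or transcoding after extraction. This is a minimal sketch (the function name and path are ours, not part of the dataset tooling):

```python
import wave

def check_dataset_wav(path: str) -> float:
    """Verify a dataset .wav file is 8,000 Hz, 16-bit LPCM,
    and return its duration in seconds."""
    with wave.open(path, "rb") as f:
        if f.getframerate() != 8000:
            raise ValueError(f"expected 8,000 Hz, got {f.getframerate()} Hz")
        if f.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if f.getcomptype() != "NONE":
            raise ValueError("expected uncompressed LPCM")
        return f.getnframes() / f.getframerate()
```

For example, a training chunk should report a duration of 30.0 seconds.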
```bibtex
@inproceedings{araz_enhancing_2025,
  title     = {Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification},
  author    = {Araz, R. Oguz and Cortès-Sebastià, Guillem and Molina, Emilio and Serrà, Joan and Serra, Xavier and Mitsufuji, Yuki and Bogdanov, Dmitry},
  booktitle = {Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR)},
  year      = {2025}
}
```

## Directory Structure

```
.
└── README.md
└── degradation
│   └── bg_noise
│   │   └── test
│   │   └── train
│   └── microphone_ir
│   │   └── test
│   │   └── train
│   └── room_ir
│   │   └── test
│   │   └── train
└── music
│   └── test-database-fma-ids.txt
│   └── test-queries-fma-ids.txt
│   └── train-fma-ids.txt
│   └── test
│   │   └── queries
│   │   │   └── clean-time_shifted-degraded
│   │   └── database
│   └── train
```

## Data Sources

- **Music**: https://github.com/mdeff/fma
- **Background Noise Degradation**:
  - https://dcase-repo.github.io/dcase_datalist/datasets/scenes/tut_asc_2016_eval.html
- **Room Impulse Response Degradation**:
  - https://www.openair.hosted.york.ac.uk/
  - https://www.iks.rwth-aachen.de/en/research/tools-downloads/databases/aachen-impulse-response-database/
  - https://mcdermottlab.mit.edu/Reverb/IR_Survey.html
- **Microphone Impulse Response Degradation**:
  - https://zenodo.org/records/4633508

## Recreation

Please check the GitHub repository for a detailed README describing how the data splits were made, how the audio was processed, and how the query audio was degraded: https://github.com/raraz15/neural-music-fp/blob/main/dataset_creation/README.md

## Queries

Each `.wav` file is accompanied by a `.npy` file with the same file stem. The NumPy array contains the indices of the audio chunk's boundaries inside the full track. These indices are useful for:

* segment-level evaluation,
* getting the clean version of the degraded audio from the full track,
* reproducibility: you can use our clean query chunks but apply different degradations; this way, the musical diversity is fixed.
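The boundary indices can be used like this to recover the clean, aligned chunk from a full track. This is a sketch under the assumption that the `.npy` array holds `[start_sample, end_sample]` positions in the 8 kHz mono track; verify the exact convention against the `dataset_creation` README:

```python
import wave

import numpy as np

def extract_clean_chunk(full_track_wav: str, boundary_npy: str) -> np.ndarray:
    """Slice the clean, aligned query chunk out of a full track.

    Assumes (not confirmed by this page) that the .npy file stores
    [start_sample, end_sample] indices into the 8 kHz, 16-bit mono track.
    """
    start, end = np.load(boundary_npy).astype(int)
    with wave.open(full_track_wav, "rb") as f:
        # 16-bit LPCM -> int16 samples
        samples = np.frombuffer(f.readframes(f.getnframes()), dtype=np.int16)
    return samples[start:end]
```

The resulting array can then be re-degraded with your own degradation pipeline while keeping the musical content identical to our queries.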
## License

Each source indicated in the "Data Sources" section carries its own license:

* FMA: Each track is distributed under the license chosen by the artist.
* DCASE TUT 2016: "Free, free for academic usage (non-commercial), usually released under university-specific EULA."
* OpenAIR: All files are licensed under CC BY 4.0.
* AIR: MIT license.
* MIT IR: License not specified.
