N20EM: a dataset for multimodal lyric transcription, proposed in our ACM MM 2022 paper "MM-ALT: A Multimodal Automatic Lyric Transcription System". The dataset contains recordings in three modalities: audio, video, and IMU motion signals.

Camera-ready version of our paper: https://arxiv.org/abs/2207.06127
Project website: https://n20em.github.io/

Note: by downloading the dataset, you confirm that you have read and agreed to the Terms and Conditions. Commercial usage is strictly prohibited.

Please cite our work as:

@inproceedings{gu2022mm,
  title={MM-ALT: A multimodal automatic lyric transcription system},
  author={Gu, Xiangming and Ou, Longshen and Ong, Danielle and Wang, Ye},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages={3328--3337},
  year={2022}
}
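Because each utterance spans three modalities, a download is only usable if all three files are present. As a minimal sketch of such a completeness check, the snippet below assumes a hypothetical per-utterance layout (`<root>/<utterance_id>/audio.wav`, `video.mp4`, `imu.csv`); the actual file names and directory structure of N20EM may differ, so adapt the constants to the release you downloaded.

```python
from pathlib import Path

# Hypothetical modality file names -- adjust to the actual N20EM layout.
MODALITY_FILES = ("audio.wav", "video.mp4", "imu.csv")

def list_complete_utterances(root):
    """Return sorted IDs of utterance folders that contain all modality files."""
    root = Path(root)
    complete = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        if all((folder / name).is_file() for name in MODALITY_FILES):
            complete.append(folder.name)
    return complete
```

Running this once after extraction gives a quick sanity check that no modality was lost during download.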
Added data for VAD (voice activity detection) training and the accompaniment track for each utterance.
lyric transcription, music information retrieval, multimodal