N20EM: a dataset for multimodal lyric transcription, proposed in our ACM MM 2022 paper "MM-ALT: A Multimodal Automatic Lyric Transcription System". The dataset contains recordings in three modalities: audio, video, and IMU motion signals.

Camera-ready version of our paper: https://arxiv.org/abs/2207.06127
Project website: https://n20em.github.io/

Note: by downloading the dataset, you confirm that you have read and agreed to the Terms and Conditions. Commercial usage is strictly prohibited.

Please cite our work as:

@inproceedings{gu2022mm,
  title={MM-ALT: A multimodal automatic lyric transcription system},
  author={Gu, Xiangming and Ou, Longshen and Ong, Danielle and Wang, Ye},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages={3328--3337},
  year={2022}
}
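Because each utterance spans three modalities, a download is only usable if all three files are present. As a minimal sketch of such a completeness check, the snippet below assumes a hypothetical per-utterance layout (`<root>/<utterance_id>/audio.wav`, `video.mp4`, `imu.csv`); the actual file names and directory structure of N20EM may differ, so adapt the constants to the release you downloaded.

```python
from pathlib import Path

# Hypothetical modality file names -- adjust to the actual N20EM layout.
MODALITY_FILES = ("audio.wav", "video.mp4", "imu.csv")

def list_complete_utterances(root):
    """Return sorted IDs of utterance folders that contain all modality files."""
    root = Path(root)
    complete = []
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        if all((folder / name).is_file() for name in MODALITY_FILES):
            complete.append(folder.name)
    return complete
```

Running this once after extraction gives a quick sanity check that no modality was lost during download.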
Added data for VAD (voice activity detection) training and the accompaniment track for each utterance.
lyric transcription, music information retrieval, multimodal