YouTube8M-MusicTextClips

YouTube8M-MusicTextClips Dataset

This page includes the YouTube8M-MusicTextClips dataset from our CVPR 2023 paper:

Language-Guided Music Recommendation for Video via Prompt Analogies
Daniel McKee (1), Justin Salamon (2), Josef Sivic (2, 3), Bryan Russell (2)
(1) University of Illinois at Urbana-Champaign
(2) Adobe Research
(3) Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University

The dataset is licensed under a research-only, non-commercial Adobe Research License. Please see the attached LICENSE file for more information.

Dataset Description

The YouTube8M-MusicTextClips dataset consists of over 4k high-quality human text descriptions of music found in video clips from the YouTube8M dataset. For each selected YouTube music video, we extracted a 10-second clip from the middle of the video for annotation. We provided annotators with only the audio corresponding to this clip. The text annotations therefore describe the audio alone, not the visual content of the clip.

The dataset annotations are divided into train and test split files. As the dataset is meant mainly for evaluation, there are 3169 annotated clips from the test set and only 1000 annotated clips from the train set. Each file contains the following information for each sample:

video_id: The YouTube ID of the video containing the annotated clip
start: Start time (in seconds) of the annotated clip in the video
end: End time (in seconds) of the annotated clip in the video
text: The text annotation describing the music in the annotated clip

For more information, please check our project page and paper: https://www.danielbmckee.com/language-guided-music-for-video/

Citation

If you use this dataset, please cite our paper:

McKee, D., Salamon, J., Sivic, J., & Russell, B. (2023). Language-Guided Music Recommendation for Video via Prompt Analogies. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023).
Bibtex:

@InProceedings{mckee2023language,
    author    = {McKee, Daniel and Salamon, Justin and Sivic, Josef and Russell, Bryan},
    title     = {Language-Guided Music Recommendation for Video via Prompt Analogies},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2023},
}
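
Example: reading a split file

The per-sample fields listed in the dataset description (video_id, start, end, text) could be read along the following lines. This is only a minimal sketch: it assumes the split files are comma-separated with a header row naming those four fields, which this page does not state, so check the actual files first. The names ClipAnnotation, read_annotations, and watch_url are hypothetical helpers, and the sample row is a made-up placeholder, not a real annotation.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, TextIO


@dataclass
class ClipAnnotation:
    """One annotated clip, using the four fields named on this page."""
    video_id: str
    start: float  # clip start time in seconds
    end: float    # clip end time in seconds
    text: str     # human description of the music in the clip

    @property
    def duration(self) -> float:
        """Clip length in seconds."""
        return self.end - self.start


def read_annotations(handle: TextIO) -> List[ClipAnnotation]:
    """Parse annotations from an open text file.

    ASSUMPTION: the file is CSV with a header row containing
    video_id, start, end, text. Adjust if the real files differ.
    """
    reader = csv.DictReader(handle)
    return [
        ClipAnnotation(
            video_id=row["video_id"],
            start=float(row["start"]),
            end=float(row["end"]),
            text=row["text"],
        )
        for row in reader
    ]


def watch_url(clip: ClipAnnotation) -> str:
    """Build a YouTube link that starts playback at the clip start."""
    return f"https://www.youtube.com/watch?v={clip.video_id}&t={int(clip.start)}s"


# Made-up placeholder row for illustration only (not from the dataset).
SAMPLE = (
    "video_id,start,end,text\n"
    'PLACEHOLDER_ID,95,105,"Made-up placeholder description, not a real annotation."\n'
)

if __name__ == "__main__":
    for clip in read_annotations(io.StringIO(SAMPLE)):
        print(clip.video_id, clip.duration, watch_url(clip))
```

To read a real split file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points to wherever the train or test file was downloaded.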

Related Organizations

University of Illinois at Urbana-Champaign, United States
Czech Technical University in Prague, Czech Republic

Keywords

YouTube8M-MusicTextClips, music descriptions, music captions, machine learning, human annotations, YouTube8M
