MIR-1K Dataset

Multimedia Information Retrieval (MIR) lab, 1000 song clips, a dataset for singing voice separation. Work by Chao-Ling Hsu and Prof. Jyh-Shing Roger Jang.

The MIR-1K dataset is designed for research on singing voice separation. MIR-1K contains 1000 song clips in which the music accompaniment and the singing voice are recorded in the left and right channels, respectively. Manual annotations include pitch contours in semitones, indices and types of unvoiced frames, lyrics, and vocal/non-vocal segments. Speech recordings of the lyrics, spoken by the same person who sang the songs, are also provided. The undivided songs of MIR-1K are now available for download.

Each song clip is named in the form "SingerId_SongId_ClipId". The duration of each clip ranges from 4 to 13 seconds, and the total length of the dataset is 133 minutes. The clips are extracted from 110 karaoke songs, each containing a mixture track and a music accompaniment track. The songs were freely selected from 5000 Chinese pop songs and sung by our labmates: 8 females and 11 males. Most of the singers are amateurs without professional music training.

Labels for the unvoiced sounds

In MIR-1K, every frame of each clip is manually labeled as one of five sound classes:
- unvoiced stop
- unvoiced fricative and affricate
- /h/
- inhaling sound
- others (voiced sound and music accompaniment)
The frame length and frame shift are 40 ms and 20 ms, respectively.

Sound demos for unvoiced singing voice separation are available.

Download MIR-1K dataset: http://mirlab.org/dataset/public/MIR-1K.rar
Download MIR-1K dataset for MIREX: http://mirlab.org/dataset/public/MIR-1K_for_MIREX.rar

Relevant publications

[1] Chao-Ling Hsu, DeLiang Wang, Jyh-Shing Roger Jang, and Ke Hu, "A Tandem Algorithm for Singing Pitch Extraction and Voice Separation from Music Accompaniment," IEEE Trans. Audio, Speech, and Language Processing, 2011 (accepted).
[2] Chao-Ling Hsu and Jyh-Shing Roger Jang, "On the Improvement of Singing Voice Separation for Monaural Recordings Using the MIR-1K Dataset," IEEE Trans. Audio, Speech, and Language Processing, vol. 18, no. 2, pp. 310-319, 2010.
[3] Chao-Ling Hsu, DeLiang Wang, and Jyh-Shing Roger Jang, "A Trend Estimation Algorithm for Singing Pitch Detection in Musical Recordings," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, Mar. 2011.
[4] Chao-Ling Hsu, Liang-Yu Chen, Jyh-Shing Roger Jang, and Hsing-Ji Li, "Singing Pitch Extraction from Monaural Polyphonic Songs by Contextual Audio Modeling and Singing Harmonic Enhancement," International Society for Music Information Retrieval Conference (ISMIR), Kobe, Japan, Oct. 2009.
[5] Chao-Ling Hsu and Jyh-Shing Roger Jang, "Singing Pitch Extraction by Voice Vibrato/Tremolo Estimation and Instrument Partial Deletion," International Society for Music Information Retrieval Conference (ISMIR), Utrecht, Netherlands, Aug. 2010.
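Because accompaniment and vocal sit in separate stereo channels, a clip can be split into its two sources with standard-library tools alone. The sketch below is not part of the dataset distribution; it assumes the clips are 16-bit stereo PCM WAV files, uses the common MIDI semitone convention (A4 = 440 Hz = semitone 69) for illustrating the pitch annotation (the dataset's own semitone reference may differ), and also shows how the 40 ms / 20 ms frame grid used for the unvoiced-sound labels can be enumerated:

```python
import math
import struct
import wave

def split_channels(path):
    """Split a stereo clip into accompaniment (left) and vocal (right) samples."""
    with wave.open(path, "rb") as wf:
        assert wf.getnchannels() == 2, "MIR-1K clips are stereo"
        n = wf.getnframes()
        sr = wf.getframerate()
        raw = wf.readframes(n)
    samples = struct.unpack("<%dh" % (2 * n), raw)  # 16-bit little-endian PCM
    accompaniment = samples[0::2]  # left channel
    vocal = samples[1::2]          # right channel
    return sr, accompaniment, vocal

def frame_times(n_samples, sr, frame_ms=40, shift_ms=20):
    """Start/end sample indices of the 40 ms frames (20 ms shift) that the
    unvoiced-sound labels refer to."""
    frame = int(sr * frame_ms / 1000)
    shift = int(sr * shift_ms / 1000)
    return [(s, s + frame) for s in range(0, n_samples - frame + 1, shift)]

def hz_to_semitone(f_hz):
    """Frequency in Hz to a MIDI-style semitone number (A4 = 440 Hz -> 69)."""
    return 69.0 + 12.0 * math.log2(f_hz / 440.0)
```

For example, one second of audio at 16 kHz yields 49 overlapping 40 ms frames, matching one label per 20 ms shift after the first full frame.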
Vocal detection, Singing voice separation
| Indicator | Description | Value |
| selected citations | Citations derived from selected sources; an alternative to the "influence" indicator. | 0 |
| popularity | The "current" impact/attention (the "hype") of the article in the research community at large, based on the underlying citation network. | Average |
| influence | The overall/total impact of the article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | The initial momentum of the article directly after its publication, based on the underlying citation network. | Average |
| views | | 320 |
| downloads | | 2 |

Views provided by UsageCounts
Downloads provided by UsageCounts