
Introduction

DnR-nonverbal is a dataset for cinematic audio source separation (CASS) based on the Divide and Remaster (DnR) dataset. Unlike conventional CASS datasets, it contains non-verbal sounds such as laughter and screaming, as actual movie audio does. This allows CASS models to learn to assign non-verbal sounds to the same stem as speech. Examples of clips and separation results are available at https://tky823.github.io/hasumi2025dnr.github.io/

How to Use

1. Download dnr-nonverbal.tar.gz from this page.
2. Extract the archive:
       tar xvzf dnr-nonverbal.tar.gz
3. (Optional) Merge the extracted directories with those of DnR. Sample IDs are assigned so that they do not collide with the IDs used in DnR.

Dataset Structure

The dataset structure follows DnR, except that non-verbal sounds are included as part of the speech stem.

    dnr-nonverbal
    ├── tr
    │   ├── 100009
    │   │   ├── annots.csv
    │   │   ├── background.wav
    │   │   ├── foreground.wav
    │   │   ├── mix.wav
    │   │   ├── music.wav
    │   │   ├── nonverbal.wav
    │   │   ├── reading.wav
    │   │   ├── sfx.wav
    │   │   └── speech.wav
    │   ├── 100031
    │   ...
    ├── cv
    └── tt

- reading.wav: Reading-style speech extracted from LibriSpeech.
- nonverbal.wav: Non-verbal sounds collected from FSD50K and newly crawled from FreeSound.
- speech.wav: Mixture of reading-style speech and non-verbal sounds.
- music.wav: Background music extracted from FMA (medium).
- foreground.wav: Foreground effect sounds collected from FSD50K.
- background.wav: Background effect sounds collected from FSD50K.
- sfx.wav: Foreground and background effect sounds.
- annots.csv: A metadata file that identifies the sources of the sounds.

Citation

    @inproceedings{hasumi25_interspeech,
      title     = {{DnR-nonverbal: Cinematic audio source separation dataset containing non-verbal sounds}},
      author    = {Takuya Hasumi and Yusuke Fujita},
      year      = {2025},
      booktitle = {Interspeech 2025},
      pages     = {4993--4997},
      doi       = {10.21437/Interspeech.2025-1148},
      issn      = {2958-1796},
    }
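
After extracting the archive, it can be useful to confirm that each sample directory contains the files listed under Dataset Structure. The following is a minimal sketch, not part of the dataset's tooling: the function name `find_incomplete_samples` and the constant `EXPECTED_FILES` are hypothetical, and the sketch assumes that every sample in every split (tr, cv, tt) has the same nine files as the 100009 example shown above.

```python
from pathlib import Path

# File names taken from the directory tree in the dataset description.
# Assumption: every sample directory contains exactly this set.
EXPECTED_FILES = (
    "annots.csv",
    "background.wav",
    "foreground.wav",
    "mix.wav",
    "music.wav",
    "nonverbal.wav",
    "reading.wav",
    "sfx.wav",
    "speech.wav",
)


def find_incomplete_samples(root, split="tr"):
    """Return {sample_id: [missing file names]} for samples under root/split."""
    incomplete = {}
    split_dir = Path(root) / split
    # Each subdirectory of the split is treated as one sample (e.g. "100009").
    for sample_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        missing = [
            name for name in EXPECTED_FILES
            if not (sample_dir / name).is_file()
        ]
        if missing:
            incomplete[sample_dir.name] = missing
    return incomplete
```

Calling `find_incomplete_samples("dnr-nonverbal", "tr")` returns an empty dictionary when every sample in the training split has all expected files; the same check can be repeated for `"cv"` and `"tt"`.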
non-verbal sound, audio source separation, cinematic audio source separation
