Overview

This is the DIFFRIR dataset, released as part of "Hearing Anything Anywhere" (CVPR 2024). It contains monoaural and binaural Room Impulse Responses (RIRs) and music recordings in four different rooms:

- A Classroom
- An Acoustically Dampened Room
- A Hallway
- A Complex Room (shaped like a pentagonal prism, with several irregular surfaces of varying materials and pillars)

In each room, we measure RIRs using both monoaural and binaural microphones at several hundred precisely measured locations, and we record music from these locations as well. In the Dampened Room, the Hallway, and the Complex Room, we also record several additional configurations, where we rotate or translate the speaker used to measure the RIR, or insert one or more whiteboard panels at different locations in the room. Each of these configurations also contains measurements of monoaural and binaural RIRs and music.

Organization

There are 14 .zip files in this dataset, containing data from the 14 room configurations across the four rooms. Zip files with the suffix "base" contain data from each room's base configuration, and hallwayPanel1.zip, hallwayPanel2.zip, and hallwayPanel3.zip contain data from each of the three panel configurations in the hallway.

Files

All data is stored as .npy files. All audio data is stored as numpy float64 arrays with a sampling rate of 48,000 Hz. All audio files are aligned such that the time the recording starts is equal to the time that the speaker begins playing the music or the hypothetical impulse. In the descriptions below, we specify the contents and shape of each file. N_mono is the number of monoaural data points and N_binaural is the number of binaural data points recorded in the room configuration.

Each .zip file contains 5 files for the monoaural data points:

- RIRs.npy - (N_mono, 671884) - monoaural room impulse responses measured for the room's configuration.
- music.npy - (N_mono, N_songs, 623884) - monoaural music recordings, measured at the same locations as RIRs.npy. N_songs is either 1 or 5.
- xyzs.npy - (N_mono, 3) - the xyz microphone locations in meters at which RIRs.npy and music.npy were recorded.
- music_dls.npy - (N_mono, N_songs, 624000) - music source files for each of the monoaural music data points. The source is measured by recording a loopback signal, and is aligned such that convolving a source from music_dls.npy with the corresponding RIR in RIRs.npy estimates the corresponding music recording in music.npy (a verification sketch follows the file listings below).
- mic_numbers.npy - (N_mono,) - an array of integers identifying the microphone used for each recording in RIRs.npy and music.npy.

Each .zip file also contains 4 files for the binaural data points:

- bin_RIRs.npy - (N_binaural, 2, 671884) - binaural room impulse responses measured for the room's configuration.
- bin_music.npy - (N_binaural, N_songs, 2, 623884) - binaural music recordings, measured at the locations in bin_xyzs.npy. N_songs is 5.
- bin_xyzs.npy - (N_binaural, 3) - the xyz locations in meters at which each row in bin_RIRs.npy and bin_music.npy was recorded. The location is the center of the binaural microphone. To get the location of the left microphone, add 5.5 cm to the x coordinate; to get the location of the right microphone, subtract 5.5 cm from the x coordinate (the binaural microphone was facing in the -y direction in all rooms/configurations).
- bin_music_dls.npy - (N_binaural, N_songs, 624000) - music sources for each binaural data point, measured via direct line loopback.
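As a rough illustration of the layout described above, the sketch below loads one monoaural data point, compares its music recording against the convolution of the loopback source with the corresponding RIR, and recovers left/right binaural capsule positions from bin_xyzs.npy. The extraction directory name, the data-point index, and the song index are placeholders, not names defined by the dataset.

```python
import numpy as np
from scipy.signal import fftconvolve

FS = 48_000                      # sampling rate of all audio in the dataset
DATA_DIR = "classroomBase"       # placeholder: folder extracted from one of the .zip files
i, s = 0, 0                      # placeholder data-point index and song index

# Monoaural data: RIRs, music recordings, loopback sources, mic locations/IDs.
rirs = np.load(f"{DATA_DIR}/RIRs.npy")                # (N_mono, 671884)
music = np.load(f"{DATA_DIR}/music.npy")              # (N_mono, N_songs, 623884)
music_dls = np.load(f"{DATA_DIR}/music_dls.npy")      # (N_mono, N_songs, 624000)
xyzs = np.load(f"{DATA_DIR}/xyzs.npy")                # (N_mono, 3)
mic_numbers = np.load(f"{DATA_DIR}/mic_numbers.npy")  # (N_mono,)

# Convolving the loopback source with the RIR should approximate the recording.
est = fftconvolve(music_dls[i, s], rirs[i])[: music.shape[-1]]
rec = music[i, s]
print("relative error:", np.linalg.norm(est - rec) / np.linalg.norm(rec))

# Binaural data: recover left/right capsule positions from the center location.
bin_xyzs = np.load(f"{DATA_DIR}/bin_xyzs.npy")        # (N_binaural, 3)
left = bin_xyzs + np.array([0.055, 0.0, 0.0])         # +5.5 cm in x
right = bin_xyzs - np.array([0.055, 0.0, 0.0])        # -5.5 cm in x
print("mic", mic_numbers[i], "at", xyzs[i], "| first binaural pair:", left[0], right[0])
```

The relative error will not be zero, since the convolution with a measured RIR only estimates the true recording, but it gives a quick sanity check that the files are aligned as described.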
Room Geometry, Speaker Location

The geometric measurements of the rooms, including the speaker locations and the locations of all surfaces, are provided in the rooms/ folder of the GitHub repository.

Microphone Calibrations

All monoaural microphone recordings were performed using EMM6 measurement microphones. They have already been adjusted to account for differences in microphone sensitivity, according to each microphone's sensitivity at 1000 Hz. If you would like to perform more fine-grained frequency calibration, the EMM6 calibration files for each dataset are included in mic_calibrations.zip. This archive contains four subfolders, one per room. Each subfolder contains the microphone calibration .txt files, titled {mic_id}_{mic_serial_number}.txt, where {mic_id} corresponds to the microphone ID number provided in mic_numbers.npy for each monoaural data point. For more information on microphone calibration files, refer to this website.
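The sketch below shows one way the calibration files might be matched to a monoaural data point: it looks up the mic ID from mic_numbers.npy, finds the {mic_id}_{mic_serial_number}.txt file with that ID, and parses it into frequency/gain pairs. The room subfolder name and extraction path are placeholders, and the assumed line format (a frequency in Hz followed by a gain in dB) should be checked against the actual files and the linked documentation.

```python
import glob
import numpy as np

CAL_DIR = "mic_calibrations/classroom"                  # placeholder: one of the four room subfolders
mic_numbers = np.load("classroomBase/mic_numbers.npy")  # placeholder extraction path
i = 0                                                   # placeholder data-point index

# Calibration files are named {mic_id}_{mic_serial_number}.txt; match on mic_id.
mic_id = int(mic_numbers[i])
matches = glob.glob(f"{CAL_DIR}/{mic_id}_*.txt")
assert matches, f"no calibration file found for microphone {mic_id}"

# Assumed format: one "frequency_Hz gain_dB" pair per line; header or comment
# lines that do not parse as numbers are skipped.
freqs, gains_db = [], []
with open(matches[0]) as f:
    for line in f:
        parts = line.split()
        if len(parts) >= 2:
            try:
                freqs.append(float(parts[0]))
                gains_db.append(float(parts[1]))
            except ValueError:
                continue

freqs, gains_db = np.array(freqs), np.array(gains_db)
print(f"mic {mic_id}: {len(freqs)} calibration points, "
      f"gain near 1 kHz = {gains_db[np.argmin(np.abs(freqs - 1000))]:.2f} dB")
```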
sound, audio, room acoustics, music, acoustics, signal processing, room impulse responses