
TAPS (Throat and Acoustic Paired Speech) is a paired speech corpus for deep learning–based speech enhancement, providing synchronized recordings from an accelerometer-based throat microphone and an acoustic microphone.

Contents:
- 60 native Korean speakers, gender-balanced (50/50)
- Total: 6,000 utterances, ~15.3 hours
- Splits: train (4,000 utt, 40 speakers), dev (1,000 utt, 10 speakers), test (1,000 utt, 10 speakers)
- No speaker overlap across splits

Files:
This Zenodo record contains ZIP archives for each split. The training split is provided as four ZIP files due to upload size constraints:
- TAPS_data_train_1.zip
- TAPS_data_train_2.zip
- TAPS_data_train_3.zip
- TAPS_data_train_4.zip
- TAPS_data_dev.zip
- TAPS_data_test.zip

To use the full training set, download and extract all four training ZIP files into the same target directory.

Inside each ZIP: ///
- throat_microphone.wav (paired throat signal)
- acoustic_microphone.wav (paired acoustic signal)
- features.json (metadata)

Metadata fields (features.json):
- gender, speaker_id, sentence_id, duration
- text: original transcription
- normalized_text: normalized transcription (numbers spelled out in Korean; punctuation normalized)
- throat_microphone/acoustic_microphone: sampling_rate, num_samples, etc.

Use cases:
- throat-microphone speech enhancement (recovering attenuated high-frequency components)
- multimodal speech processing and related tasks

Project homepage and an alternative distribution (different file format) are provided in Related works. The accompanying paper is available on arXiv.
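The extraction and pairing steps described above can be sketched in Python using only the standard library. This is a minimal sketch, not an official loader: the function names `extract_archives`, `read_wav_info`, and `iter_utterances` are hypothetical, and the code assumes only that each utterance folder holds `throat_microphone.wav`, `acoustic_microphone.wav`, and `features.json` side by side. It makes no assumption about the directory levels above that folder, and it assumes the WAV files are PCM-encoded, which is what the standard `wave` module reads.

```python
import json
import wave
import zipfile
from pathlib import Path


def extract_archives(zip_paths, target_dir):
    """Extract every archive into the same target directory."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(target)
    return target


def read_wav_info(path):
    """Return (sampling_rate, num_frames) from a PCM WAV header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getframerate(), wav.getnframes()


def iter_utterances(root):
    """Yield one record per folder that holds a complete utterance triple."""
    # Search recursively so the sketch does not depend on the folder
    # hierarchy above each utterance (an assumption-free lookup).
    for meta_path in sorted(Path(root).rglob("features.json")):
        folder = meta_path.parent
        throat = folder / "throat_microphone.wav"
        acoustic = folder / "acoustic_microphone.wav"
        if not (throat.is_file() and acoustic.is_file()):
            continue  # skip incomplete folders rather than guess
        with open(meta_path, encoding="utf-8") as handle:
            metadata = json.load(handle)
        yield {"metadata": metadata, "throat": throat, "acoustic": acoustic}
```

For the training split, one would pass the four `TAPS_data_train_*.zip` paths to `extract_archives` with a single target directory, then loop over `iter_utterances` on that directory. Comparing `read_wav_info` on the two WAV files of a record is a cheap way to check that a pair has matching sampling rates and lengths before training.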
