
PartialSpoof Database - Partially Spoofed Audio Dataset for Anti-spoofing

Keywords: anti-spoofing, deepfakes, multi-level spoof detection, partially-spoofed attack, PartialSpoof

Type: Other research product (Audiovisual)
Date: 27 May 2021
Publisher: Zenodo

Authors: Zhang, Lin; Wang, Xin; Cooper, Erica; Yamagishi, Junichi; Patino, Jose; Evans, Nicholas

doi: 10.5281/zenodo.5112031, 10.5281/zenodo.4817531, 10.5281/zenodo.5766198, 10.5281/zenodo.4817532


Abstract

All existing databases of spoofed speech contain attack data that is spoofed in its entirety. In practice, it is entirely plausible that successful attacks can be mounted with utterances that are only partially spoofed. By definition, partially-spoofed utterances contain a mix of both spoofed and bona fide segments, which will likely degrade the performance of countermeasures trained with entirely spoofed utterances. This hypothesis raises the obvious question: ‘Can we detect partially spoofed audio?’ This paper introduces a new database of partially-spoofed data, named PartialSpoof, to help address this question. This new database enables us to investigate and compare the performance of countermeasures on both utterance- and segmental-level labels. Experimental results using the utterance-level labels reveal that the reliability of countermeasures trained to detect fully-spoofed data degrades substantially when tested with partially-spoofed data, whereas training on partially-spoofed data performs reliably in the case of both fully- and partially-spoofed utterances. Additional experiments using segmental-level labels show that spotting injected spoofed segments included in an utterance is a much more challenging task even if the latest countermeasure models are used.

For the initial version, PartialSpoof v1.0:
arXiv: https://arxiv.org/abs/2104.02518
Samples: https://nii-yamagishilab.github.io/zlin-demo/IS2021/index.html
PartialSpoof Database v1.0: https://zenodo.org/record/4817532

For the multi-task version, PartialSpoof v1.1:
arXiv: https://arxiv.org/abs/2107.14132
PartialSpoof Database v1.1 (including segmental-level labels): https://zenodo.org/record/5112031

P.S.
1. Compared to PartialSpoof_v1.0, only database_segment_labels.tar.gz and README_v1.1 are updated for version 1.1; you don't need to download the other files if you already downloaded version 1.0.
2. The file database_eval.tar.gz is fairly large; if you cannot download it smoothly, you can download the split database_eval.tar.gz from PartialSpoof_v1.0.
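
The relationship between the segmental-level and utterance-level labels described in the abstract can be illustrated with a minimal Python sketch. It assumes that an utterance counts as spoofed as soon as any one of its segments is spoofed, which follows from the definition of a partially-spoofed utterance given above. The label strings ("bonafide", "spoof"), the equal-length segments, and the function names are hypothetical choices made for illustration; the actual label format shipped in database_segment_labels.tar.gz is not described on this page and may differ.

```python
from typing import Sequence

# Hypothetical label strings, chosen for illustration only.
BONA_FIDE = "bonafide"
SPOOF = "spoof"


def utterance_label(segment_labels: Sequence[str]) -> str:
    """Derive an utterance-level label from segment-level labels.

    Assumption: one spoofed segment is enough to make the whole
    utterance count as spoofed.
    """
    if not segment_labels:
        raise ValueError("an utterance needs at least one segment label")
    if any(label == SPOOF for label in segment_labels):
        return SPOOF
    return BONA_FIDE


def spoof_fraction(segment_labels: Sequence[str]) -> float:
    """Fraction of segments labelled as spoofed (segments assumed equal length)."""
    if not segment_labels:
        raise ValueError("an utterance needs at least one segment label")
    spoofed = sum(1 for label in segment_labels if label == SPOOF)
    return spoofed / len(segment_labels)


if __name__ == "__main__":
    partially_spoofed = [BONA_FIDE, SPOOF, SPOOF, BONA_FIDE]
    print(utterance_label(partially_spoofed))
    print(spoof_fraction(partially_spoofed))
```

Under this assumption, a fully bona fide utterance and a partially-spoofed one differ at the utterance level even when only a single short segment has been replaced, which is why the segment-level task is the finer-grained of the two.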

References

Zhang, L., Wang, X., Cooper, E., Yamagishi, J., Patino, J., & Evans, N. (2021). An Initial Investigation for Detecting Partially Spoofed Audio. arXiv preprint arXiv:2104.02518.

Wang, X., Yamagishi, J., Todisco, M., Delgado, H., Nautsch, A., Evans, N., ... & Ling, Z. H. (2020). ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech. Computer Speech & Language, 64, 101114.

Zhang, L., Wang, X., Cooper, E., & Yamagishi, J. (2021). Multi-Task Learning in Utterance-Level and Segmental-Level Spoof Detection. arXiv preprint arXiv:2107.14132.

This database was partially supported by the Japanese-French joint national VoicePersonae project, funded by JST CREST (JPMJCR18A6) and the ANR (ANR-18-JSTS-0001), as well as by JST CREST Grants (JPMJCR20D3), MEXT KAKENHI Grants (16H06302, 18H04120, 18H04112, 18KT0051), Japan, and the Google AI for Japan program.

Related Organizations

National Institute of Informatics (NII)
Japan


Related research

An Initial Investigation for Detecting Partially Spoofed Audio (2021)
The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance (2023)

Impact by BIP!

citations: 0. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.