Zero-Query Adversarial Attack on Black-box Automatic Speech Recognition Systems

Name: Zero-Query Adversarial Attack on Black-box Automatic Speech Recognition Systems
Keywords: FOS: Computer and information sciences, Sound (cs.SD), Computer Science - Cryptography and Security, Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Cryptography and Security (cs.CR), Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing

Zheng Fang; Tao Wang; Lingchen Zhao; Shenyi Zhang; Bowen Li; Yunjie Ge; Qi Li; Chao Shen; Qian Wang

arXiv.org e-Print Archive

Preprint . 2024

Data sources: arXiv.org e-Print Archive

https://doi.org/10.1145/365864...

Article . 2024 . Peer-reviewed

License: https://www.acm.org/publications/policies/copyright_policy#Background

Data sources: Crossref

https://dx.doi.org/10.48550/ar...

Article . 2024

License: arXiv Non-Exclusive Distribution

Data sources: Datacite

Publication: Article, Preprint . 02 Dec 2024
Embargo end date: 01 Jan 2024
Publisher: ACM
Journal: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security

doi: 10.1145/3658644.3670309 , 10.48550/arxiv.2406.19311

arXiv: 2406.19311

Abstract

In recent years, extensive research has been conducted on the vulnerability of ASR systems, revealing that black-box adversarial example attacks pose significant threats to real-world ASR systems. However, most existing black-box attacks rely on queries to the target ASRs, which is impractical when queries are not permitted. In this paper, we propose ZQ-Attack, a transfer-based adversarial attack on ASR systems in the zero-query black-box setting. Through a comprehensive review and categorization of modern ASR technologies, we first meticulously select surrogate ASRs of diverse types to generate adversarial examples. Following this, ZQ-Attack initializes the adversarial perturbation with a scaled target command audio, rendering it relatively imperceptible while maintaining effectiveness. Subsequently, to achieve high transferability of adversarial perturbations, we propose a sequential ensemble optimization algorithm, which iteratively optimizes the adversarial perturbation on each surrogate model, leveraging collaborative information from other models. We conduct extensive experiments to evaluate ZQ-Attack. In the over-the-line setting, ZQ-Attack achieves a 100% success rate of attack (SRoA) with an average signal-to-noise ratio (SNR) of 21.91dB on 4 online speech recognition services, and attains an average SRoA of 100% and SNR of 19.67dB on 16 open-source ASRs. For commercial intelligent voice control devices, ZQ-Attack also achieves a 100% SRoA with an average SNR of 15.77dB in the over-the-air setting.
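The abstract reports perturbation strength as a signal-to-noise ratio (SNR) and says the perturbation is initialized from a scaled target command audio. The sketch below illustrates only those two ideas, under stated assumptions: it uses the standard definition SNR = 10 log10(P_carrier / P_perturbation), and it picks the scaling gain so that the initial perturbation hits a chosen SNR. The abstract does not give the actual scaling rule, so `scale_to_snr`, `snr_db`, `mean_power`, the 20 dB target, and the random stand-in signals are all hypothetical and are not the authors' implementation.

```python
import math
import random


def mean_power(samples):
    """Average power of a signal given as a list of float samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(carrier, perturbation):
    """SNR in dB, treating the carrier as signal and the perturbation as noise."""
    return 10.0 * math.log10(mean_power(carrier) / mean_power(perturbation))


def scale_to_snr(carrier, target_command, target_snr_db):
    """Scale target_command so its SNR relative to carrier equals target_snr_db.

    Assumption: a single global gain is used; the paper's rule may differ.
    """
    desired_noise_power = mean_power(carrier) / (10.0 ** (target_snr_db / 10.0))
    gain = math.sqrt(desired_noise_power / mean_power(target_command))
    return [gain * s for s in target_command]


# Stand-in signals: random noise in place of real carrier and command audio.
random.seed(0)
carrier = [random.gauss(0.0, 1.0) for _ in range(16000)]
target_command = [random.gauss(0.0, 0.5) for _ in range(16000)]

# Initial perturbation: the target command, scaled down to a 20 dB SNR.
perturbation = scale_to_snr(carrier, target_command, 20.0)
adversarial = [c + p for c, p in zip(carrier, perturbation)]

print(f"initial SNR: {snr_db(carrier, perturbation):.2f} dB")
```

A higher SNR means a quieter perturbation relative to the carrier. The sequential ensemble optimization described in the abstract would then refine this initial perturbation on the surrogate models; that step is not sketched here.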

To appear in the Proceedings of The ACM Conference on Computer and Communications Security (CCS), 2024

Related Organizations

Tsinghua University
China (People's Republic of)
Wuhan University
China (People's Republic of)
Xi'an Jiaotong University
China (People's Republic of)
Xi’an Jiaotong-Liverpool University
China (People's Republic of)

whisper software on GitHub
IsRelatedTo

Impact by BIP!

selected citations: 10
These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).

popularity: Top 10%
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.

influence: Top 10%
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).

impulse: Top 10%
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Related to Research communities

UArctic
