Evolutionary Black-Box Adversarial Attacks

Despite the successful use of Deep Neural Networks (DNNs) in diverse real-world scenarios, recent research has shown that even state-of-the-art networks have robustness issues, as they are vulnerable to adversarial examples. Adversarial examples are manipulated inputs, generated by applying small alterations to an original input in order to subvert the outputs of a neural network, which serves as the target model. An adversarial example is crafted during an attack, which can happen in white-box scenarios, where the adversary has extensive knowledge of the target model, or in black-box scenarios, where the adversary only has access to the target model's outputs. Image-specific adversarial attacks craft a unique perturbation for each original input and have been thoroughly explored in the literature. Evolutionary computation has emerged as a successful strategy for generating adversarial examples in black-box attacks, as the attack can be modeled as an optimization problem. Previous research has shown the existence of Universal Adversarial Perturbations (UAPs): a UAP is a single perturbation capable of turning multiple original inputs into adversarial examples. Although white-box and black-box universal attacks are present in the literature, the use of evolutionary strategies for black-box attacks is not yet thoroughly explored. To study the use of evolutionary algorithms in image-specific black-box attacks, this work explores the one-pixel attack, which changes a single pixel of the input image to make the classifier predict erroneously. A pragmatic approach is used to implement and compare the attack conducted with three evolutionary algorithms: Differential Evolution (DE), Genetic Algorithm (GA), and Covariance Matrix Adaptation Evolution Strategy (CMA-ES). Overall, the experiments provided insights into the nuances of the one-pixel attack and compared three standard evolutionary algorithms, showcasing each algorithm's potential and the ability of evolutionary computation to find solutions in this strict case of the adversarial attack.
After a thorough literature review and experiments with image-specific evolutionary black-box attacks, this dissertation proposes a novel evolutionary black-box data-driven adversarial attack for the generation of UAPs in non-targeted scenarios, with an adaptation to class-specific scenarios. The attack uses a customization of TensorGP's Genetic Programming (GP) to evolve patches and create adversarial examples. Experiments were conducted using a benchmark dataset with L∞ norm thresholds set to 3%, 5%, and 10% to limit perturbation size. The method was then validated by comparison with a state-of-the-art adversarial attack.
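
As an illustration of how a one-pixel attack can be modeled as a black-box optimization problem, the following is a minimal sketch using a plain DE/rand/1/bin loop in NumPy. It is not the dissertation's implementation. The `toy_classifier` (a mean-brightness threshold), the candidate encoding `(row, col, r, g, b)` scaled to [0, 1], and all hyperparameters are assumptions chosen only to keep the example small and self-contained. The attacker only calls `query` and reads the returned probabilities.

```python
import numpy as np

def toy_classifier(image):
    """Hypothetical black-box model: class 1 if the image is bright on average."""
    logit = 50.0 * (image.mean() - 0.5)
    p1 = 1.0 / (1.0 + np.exp(-logit))
    return np.array([1.0 - p1, p1])

def apply_pixel(image, candidate):
    """Decode a candidate in [0, 1]^5 as (row, col, r, g, b) and set that one pixel."""
    h, w, _ = image.shape
    row = min(int(candidate[0] * h), h - 1)
    col = min(int(candidate[1] * w), w - 1)
    perturbed = image.copy()
    perturbed[row, col] = candidate[2:5]
    return perturbed

def one_pixel_attack(image, true_label, query, pop_size=40, generations=30,
                     f=0.5, cr=0.9, seed=0):
    """Minimize the true-class probability using only the model's outputs."""
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, 5))
    fitness = np.array([query(apply_pixel(image, c))[true_label] for c in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + f * (b - c), 0.0, 1.0)
            # Binomial crossover, forcing at least one gene from the mutant.
            cross = rng.random(5) < cr
            cross[rng.integers(5)] = True
            trial = np.where(cross, mutant, pop[i])
            # Selection: keep the trial if it lowers the true-class probability.
            trial_fit = query(apply_pixel(image, trial))[true_label]
            if trial_fit <= fitness[i]:
                pop[i], fitness[i] = trial, trial_fit
    best = pop[np.argmin(fitness)]
    return apply_pixel(image, best), float(fitness.min())

# Toy usage: a 4x4 RGB image that the toy model assigns to class 0.
image = np.full((4, 4, 3), 0.49)
adversarial, confidence = one_pixel_attack(image, true_label=0, query=toy_classifier)
```

The fitness here is simply the probability the model assigns to the true class, which matches a non-targeted setting; a GA or CMA-ES could be substituted for the inner loop while keeping the same encoding and fitness.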

Despite the success of Deep Neural Networks in real-world applications, recent research shows that even the best-performing networks face robustness problems, evidenced by their vulnerability to adversarial examples. Adversarial examples are inputs manipulated to subvert the outputs of a neural network, created with small changes to the original input. An adversarial example is crafted during an attack, which can occur in white-box scenarios, where the adversary has extensive knowledge of the target network, or in black-box scenarios, where the adversary only has access to the target network's outputs. Image-specific attacks create one perturbation for each input and have been exhaustively explored in the literature. Evolutionary computation is a popular and successful strategy for generating adversarial examples in black-box scenarios, since the attack can be modeled as an optimization problem. Studies show the existence of Universal Adversarial Perturbations (UAPs): a single perturbation capable of turning multiple inputs into adversarial examples. Although white-box and black-box universal attacks are present in the literature, the use of evolutionary strategies for black-box attacks is not yet a fully explored area. To study the application of evolutionary algorithms in image-specific attacks, this work explores the one-pixel attack, an attack that introduces changes to the pixels of the input images to fool the network. A pragmatic approach is used to carry out the attack with three evolutionary algorithms (Differential Evolution, Genetic Algorithm, and Covariance Matrix Adaptation Evolution Strategy) and to compare them. The experiments provided knowledge about the nuances of the attack and showed the potential of each algorithm and the ability of evolutionary computation to find solutions in a restricted case of the attack.
After an extensive literature review and experiments with image-specific evolutionary attacks in black-box scenarios, this dissertation proposes a new data-driven evolutionary black-box adversarial attack for generating UAPs in non-targeted scenarios, with an adaptation to class-specific scenarios. The attack uses a customization of TensorGP's Genetic Programming algorithm to evolve patches and create adversarial examples. Experiments were carried out with a benchmark dataset and L∞ limits of 3%, 5%, and 10% to control the perturbation size. The method was then validated by comparison with an adversarial attack from the literature.
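
To make the role of the L∞ thresholds concrete, here is a small sketch of how a single universal perturbation might be bounded and applied to a batch of images. It assumes that the percentages are fractions of a [0, 1] pixel range (so 5% means `epsilon = 0.05`), which the abstract does not state. The random perturbation and the `predict` function are hypothetical placeholders, and the sketch does not reproduce the GP-based patch evolution.

```python
import numpy as np

def project_linf(perturbation, epsilon):
    """Clip every component so the perturbation's L-infinity norm is at most epsilon."""
    return np.clip(perturbation, -epsilon, epsilon)

def apply_universal(images, perturbation, epsilon):
    """Add the same bounded perturbation to every image and keep pixels in [0, 1]."""
    delta = project_linf(perturbation, epsilon)
    return np.clip(images + delta, 0.0, 1.0)

def fooling_rate(predict, images, adversarial):
    """Fraction of inputs whose predicted label changes under the perturbation."""
    return float(np.mean(predict(images) != predict(adversarial)))

def predict(batch):
    """Hypothetical stand-in classifier: label 1 when the mean brightness exceeds 0.5."""
    return (batch.mean(axis=(1, 2, 3)) > 0.5).astype(int)

rng = np.random.default_rng(0)
images = rng.random((5, 4, 4, 3))             # placeholder batch of RGB images
perturbation = rng.normal(size=(4, 4, 3))     # placeholder candidate perturbation
adversarial = apply_universal(images, perturbation, epsilon=0.05)
rate = fooling_rate(predict, images, adversarial)
```

In a non-targeted universal attack, a quantity such as `rate` would be a natural fitness for the evolutionary search, since it rewards one perturbation for changing predictions across many inputs.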

Other - This work has been supported by Project "NEXUS Pacto de Inovação – Transição Verde e Digital para Transportes, Logística e Mobilidade", ref. No. 7113, supported by the Recovery and Resilience Plan (PRR) and by the European Funds Next Generation EU, following Notice No. 02/C05-i01/2022.PC645112083-00000059 (project 53), Component 5 - Capitalization and Business Innovation - Mobilizing Agendas for Business Innovation.

Master's dissertation in Informatics Engineering presented to the Faculty of Sciences and Technology

Country

Portugal

Related Organizations

University of Coimbra
Portugal

Keywords

Adversarial Examples, Evolutionary Computation, Genetic Programming, Black-Box Adversarial Attacks
