Estudo Geral
Master thesis . 2022
Data sources: Estudo Geral

Adversarial Attacks to Classification Systems

Author: Leal, João Miguel Gouveia

Abstract


Adversarial samples are inputs corrupted with inconspicuous perturbations that a given target model misclassifies. Adversaries craft adversarial samples using various methods that depend on the information available about the target system. In a white-box scenario, adversaries have full access to the model; in a black-box scenario, usually only the output layer is accessible. Researchers have developed adversarial samples that can fool target models even when the adversary has almost no information about the target system. To construct classifiers robust to adversarial samples, many authors have proposed adversarial defenses, mechanisms intended to protect deep learning models from adversarial attacks. However, many of these defenses have been shown to fail, which indicates that building robust models is an extremely difficult task. Motivated by this, frameworks have been developed that group various adversarial attacks so that users can test their models; however, none of them provides a pipeline mechanism, and the information they give about the robustness of the tested models is scarce. Several frameworks have also stopped receiving support, leaving them with outdated attacks and considerable overlap between the attacks they offer. In this dissertation, a new framework was developed with a pipeline mechanism that allows users to input their models and to choose, from the eight currently supported adversarial attacks, those to be used in the pipeline run. After the pipeline executes, each model obtains a score based on its performance against all of the images generated by the adversarial attacks, allowing a better understanding of that model's robustness. To test the validity and capabilities of the framework, an experiment was performed using the pipeline mechanism, models trained on an image classification dataset, and the eight supported adversarial attacks. The results obtained allow a deeper understanding of the robustness of the models.
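As an illustration of the white-box setting described above, the sketch below applies an FGSM-style perturbation (perturbing the input by epsilon times the sign of the loss gradient) to a hand-rolled logistic regression classifier. This is a minimal illustrative example only; the dissertation does not specify its eight attacks, and all model parameters and values here are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, x):
    """Probability that input x belongs to class 1."""
    return sigmoid(np.dot(w, x) + b)

def fgsm(w, b, x, y, eps):
    """Fast Gradient Sign Method sketch: move x in the direction
    that increases the loss, bounded by eps in the L-infinity norm."""
    p = predict(w, b, x)
    # Gradient of the binary cross-entropy loss with respect to x
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# Hypothetical white-box model and clean sample (true label 1)
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([0.5, -0.5])
y = 1.0

x_adv = fgsm(w, b, x, y, eps=0.6)
print(predict(w, b, x) > 0.5)      # clean sample classified as 1 -> True
print(predict(w, b, x_adv) > 0.5)  # adversarial sample flips to 0 -> False
```

A black-box attacker, by contrast, could not compute `grad_x` directly and would have to estimate it from model outputs alone.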
The evaluation of a model should not be based only on its accuracy on adversarial samples; it should also take into account the amount of perturbation a sample needs in order to fool the target classifier.
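In that spirit, a robustness score could weight each successful attack by the perturbation budget it required, rather than counting misclassifications alone. The function below is a hypothetical sketch, not the dissertation's actual scoring formula, and the weighting scheme and all sample values are assumptions.

```python
def robustness_score(outcomes):
    """outcomes: list of (fooled, eps) pairs, one per adversarial
    sample, where eps is the L-infinity perturbation the attack used.
    Resisted attacks earn full credit; successful attacks earn partial
    credit proportional to how large a perturbation they needed."""
    if not outcomes:
        return 1.0
    score = 0.0
    for fooled, eps in outcomes:
        if not fooled:
            score += 1.0
        else:
            score += min(eps, 1.0)
    return score / len(outcomes)

# Model A is only fooled by large perturbations; Model B by tiny ones.
model_a = [(True, 0.5), (True, 0.6), (False, 0.0)]
model_b = [(True, 0.05), (True, 0.05), (False, 0.0)]
print(robustness_score(model_a) > robustness_score(model_b))  # prints True
```

Under this scheme two models with identical adversarial accuracy can still receive different scores, capturing the distinction the abstract draws between accuracy and required perturbation.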

Other - Project co-financed by COMPETE 2020 and the European Union. Project reference: POCI-01-0247-FEDER-046969

Master's dissertation in Informatics Engineering presented to the Faculty of Sciences and Technology

Country
Portugal
Keywords

Performance Metrics, Deep Learning, Adversarial Learning, Robustness, Adversarial Attacks

  • Impact indicators provided by BIP!
    selected citations (derived from selected sources): 0
    popularity (current attention in the research community, based on the underlying citation network): Average
    influence (overall/total impact, based on the underlying citation network, diachronically): Average
    impulse (initial momentum directly after publication): Average
Access route: Green