
arXiv: 1805.01431
Machine learning models are vulnerable to adversarial examples: an adversary modifies the input data such that humans still assign the same label, but machine learning models misclassify it. Previous approaches in the literature demonstrated that adversarial examples can be generated even for remotely hosted models. In this paper, we propose a Siamese-network-based approach to generate adversarial examples for a multiclass target CNN. We assume that the adversary does not possess any knowledge of the target data distribution, and we use an unlabeled, mismatched dataset to query the target. For example, for the ResNet-50 target, we use the Food-101 dataset as the query dataset. Initially, the target model assigns labels to the query dataset, and a Siamese network is trained on the image pairs derived from these multiclass labels. We learn the adversarial perturbations for the Siamese model and show that these perturbations are also adversarial with respect to the target model. In experimental results, we demonstrate the effectiveness of our approach on MNIST, CIFAR-10 and ImageNet targets with TinyImageNet/Food-101 query datasets.
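The pair-derivation step described above can be illustrated with a minimal sketch. It assumes that two query images form a "similar" pair when the target assigned them the same label and a "dissimilar" pair otherwise; the abstract does not state the exact pairing or sampling rule, so the function name `make_siamese_pairs`, the exhaustive pairing, and the example labels are all hypothetical.

```python
from itertools import combinations


def make_siamese_pairs(target_labels):
    """Build (i, j, same) index pairs from labels assigned by the target model.

    Assumption (not taken from the paper): same = 1 when the target gave
    both query images the same label, and 0 otherwise.
    """
    pairs = []
    for i, j in combinations(range(len(target_labels)), 2):
        same = 1 if target_labels[i] == target_labels[j] else 0
        pairs.append((i, j, same))
    return pairs


# Hypothetical labels returned by the target for five query images.
labels = [3, 7, 3, 1, 7]
pairs = make_siamese_pairs(labels)
```

Exhaustive pairing grows quadratically with the number of query images, so a practical pipeline would likely sample a subset of pairs instead.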
Computer Science - Machine Learning, Statistics - Machine Learning
