Robustness Tokens: Towards Adversarial Robustness of Transformers

descriptionPublicationkeyboard_double_arrow_right Part of book or chapter of book , Article , Preprint , Conference object 21 Nov 2024Embargo end date: 01 Jan 2025 English Publisher:Springer Nature Switzerland

Authors: Brian Pulfer; Yury Belousov 0001; Slava Voloshynovskiy;

doi: 10.1007/978-3-031-73202-7_7 , 10.48550/arxiv.2503.10191

arXiv: 2503.10191

Robustness Tokens: Towards Adversarial Robustness of Transformers

- Summary
- Subjects
- Metrics

Abstract

Recently, large pre-trained foundation models have become widely adopted by machine learning practitioners for a multitude of tasks. Given that such models are publicly available, relying on their use as backbone models for downstream tasks might result in high vulnerability to adversarial attacks crafted with the same public model. In this work, we propose Robustness Tokens, a novel approach specific to the transformer architecture that fine-tunes a few additional private tokens with low computational requirements instead of tuning model parameters as done in traditional adversarial training. We show that Robustness Tokens make Vision Transformer models significantly more robust to white-box adversarial attacks while also retaining the original downstream performances.

This paper has been accepted for publication at the European Conference on Computer Vision (ECCV), 2024

Related Organizations

University of Geneva
Switzerland

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (cs.LG)

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	1
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Average
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average

Found an issue? Give us feedback

1

Average

Green