
Ensuring the authenticity of documents is more important than ever, as forgery techniques continue to evolve. Traditional methods, which rely on predefined rules and handcrafted features, often struggle to adapt to new types of fraud. To address this, we propose a Vision Transformer-based Variational Autoencoder (ViT-VAE) designed to enhance document authentication. By combining the Vision Transformer's ability to capture intricate details with the Variational Autoencoder's capacity to model the distribution of genuine documents, our approach detects anomalies through reconstruction errors. This fusion of self-attention mechanisms and probabilistic modeling improves accuracy and adaptability in identifying forged elements. Our experiments on diverse datasets show that ViT-VAE outperforms conventional machine learning and deep learning methods, offering a more reliable solution for document security. These findings open the door to further advances in fraud detection and verification technologies, strengthening trust in digital and physical documentation.
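The abstract describes flagging forgeries by reconstruction error: the VAE is trained on genuine documents only, so inputs it reconstructs poorly are treated as anomalous. A minimal sketch of that decision rule is below; it assumes a trained model already produced the per-document errors, and all function names and the threshold rule (mean plus `k` standard deviations over genuine-document errors) are illustrative, not the paper's exact procedure.

```python
import math

def reconstruction_error(original, reconstruction):
    """Mean squared error between a flattened document image and its VAE reconstruction."""
    assert len(original) == len(reconstruction)
    return sum((o - r) ** 2 for o, r in zip(original, reconstruction)) / len(original)

def calibrate_threshold(genuine_errors, k=3.0):
    """Set the anomaly threshold from errors observed on genuine documents:
    mean + k standard deviations (an illustrative choice)."""
    n = len(genuine_errors)
    mean = sum(genuine_errors) / n
    var = sum((e - mean) ** 2 for e in genuine_errors) / n
    return mean + k * math.sqrt(var)

def is_forged(error, threshold):
    """Flag a document whose reconstruction error exceeds the calibrated threshold."""
    return error > threshold
```

In use, genuine documents cluster at low error while a forged element the model cannot reconstruct pushes the score well above the threshold, so `is_forged` returns `True` only for the outlier.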
