Powered by OpenAIRE graph
IRIS Cnr
Soft Computing
Article . 2025 . Peer-reviewed
License: CC BY
Data sources: Crossref
DBLP
Article
Data sources: DBLP


The force of few: boosting deviance detection in data scarcity scenarios through self-supervised learning and pattern-based encoding

Authors: Francesco Folino; Gianluigi Folino; Massimo Guarascio; Luigi Pontieri

Abstract

In modern business environments, identifying anomalous or deviant instances in business process executions is a critical concern for enterprises and organizations. Recent advancements show that deep deviance detection models (DDMs), trained on process traces using (semi-)supervised learning techniques, outperform traditional machine learning methods. However, the effectiveness of these deep learning models often depends on large training datasets, which are not always available in practice, particularly in Green AI contexts, where data and computational resources are limited. To address these challenges, this paper presents a novel methodology for discovering deep DDMs that mitigates the impact of limited training data. Our approach incorporates an auxiliary self-supervised learning task that complements the primary deviance classification objective. In addition, we enhance the model with an autoencoder, using its reconstruction error as an additional self-supervisory signal. To promote interpretability, the model adopts a pattern-based encoding mechanism, on top of which two parallel feature-representation layers are efficiently and robustly learned through residual-like skip connections. Our method demonstrates its ability to handle the dual challenges of data efficiency and model explainability, as shown in a case study involving the execution traces of a real-world business process. The results highlight the potential of deep DDMs to achieve high performance in deviance detection, even when faced with limited data availability. Notably, our approach achieves an average performance gain (across all performance metrics) of over 15% while using only 5% of the labelled data, compared to a fully supervised baseline model, when evaluated on two publicly available logs from the current literature.
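
The two ideas named in the abstract, a pattern-based encoding of traces and a classification objective complemented by a reconstruction-error signal, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: patterns are treated as contiguous activity subsequences, the encoding is a vector of occurrence counts, the losses are binary cross-entropy and mean squared error, and the names `encode`, `multitask_loss`, and `aux_weight` are hypothetical. The sketch leaves out the neural model itself, including the parallel feature-representation layers and skip connections.

```python
import math
from typing import List, Optional, Sequence, Tuple

Trace = Sequence[str]          # one process execution: an ordered list of activity labels
Pattern = Tuple[str, ...]      # assumed pattern form: a contiguous activity subsequence


def count_pattern(trace: Trace, pattern: Pattern) -> int:
    """Count contiguous occurrences of `pattern` in `trace` (assumed semantics)."""
    n, m = len(trace), len(pattern)
    return sum(1 for i in range(n - m + 1) if tuple(trace[i:i + m]) == pattern)


def encode(trace: Trace, patterns: Sequence[Pattern]) -> List[float]:
    """Map a trace to one interpretable feature per pattern (its occurrence count)."""
    return [float(count_pattern(trace, p)) for p in patterns]


def bce(y_true: int, p: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy for a single predicted deviance probability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def mse(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Mean squared reconstruction error between an encoding and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def multitask_loss(
    y_true: Optional[int],
    p_deviant: float,
    x: Sequence[float],
    x_hat: Sequence[float],
    aux_weight: float = 0.5,   # hypothetical weighting, not from the paper
) -> float:
    """Classification loss plus a weighted reconstruction term.

    Unlabelled traces (y_true is None) contribute only the self-supervised
    reconstruction term, so they can still be used when labels are scarce.
    """
    recon = aux_weight * mse(x, x_hat)
    if y_true is None:
        return recon
    return bce(y_true, p_deviant) + recon


# Toy usage with made-up activities and patterns.
patterns = [("A",), ("A", "B"), ("B", "C")]
trace = ["A", "B", "A", "B", "C"]
x = encode(trace, patterns)
labelled = multitask_loss(1, 0.5, x, x)                  # perfect reconstruction
unlabelled = multitask_loss(None, 0.5, x, [0.0] * len(x))  # label missing
```

In this toy setup each input feature corresponds to a named pattern, which is the kind of property that makes a pattern-based encoding easier to inspect than an opaque learned embedding.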

Country
Italy
Keywords

Container log analysis, Multi-task deep learning, Green AI, Process deviance detection

  • BIP!
    Impact by BIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    2
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Top 10%
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Green
hybrid