
doi: 10.1002/sta4.322
The identification of outliers is mainly based on unannotated data and therefore constitutes an unsupervised problem. The lack of labels leads to numerous challenges that do not occur, or occur only to a lesser extent, when using annotated data and supervised methods. In this paper, we focus on two of these challenges: the selection of hyperparameters and the selection of informative features. To this end, we propose a method to transform the unsupervised problem of outlier detection into a supervised problem. Benchmarking our approach against common outlier detection methods shows clear advantages of our method when many irrelevant features are present. Furthermore, the proposed approach also performs very well in hyperparameter selection when compared to methods with randomly selected hyperparameters.
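As a rough illustration of the general idea (not necessarily the exact procedure of this paper), one common way to recast unsupervised outlier detection as a supervised problem is to label the observed data as inliers, generate a synthetic reference sample that plays the role of outliers, and fit an ordinary classifier to separate the two. Cross-validated performance on this surrogate task can then guide hyperparameter choice, and feature importances can flag informative features. The sketch below assumes this construction; the helper `synthetic_outliers`, the uniform reference sample, and the use of a random forest are illustrative choices, not details taken from the paper.

```python
# Minimal sketch: turning outlier detection into a supervised surrogate task.
# Assumption: real observations = class 0, synthetic uniform sample = class 1.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def synthetic_outliers(X, n, rng):
    """Draw n points uniformly inside the bounding box of X (hypothetical helper)."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return rng.uniform(lo, hi, size=(n, X.shape[1]))

# Unlabelled data; here just simulated for demonstration.
X = rng.normal(size=(500, 10))

# Build the supervised surrogate problem.
X_syn = synthetic_outliers(X, len(X), rng)
X_sup = np.vstack([X, X_syn])
y_sup = np.concatenate([np.zeros(len(X)), np.ones(len(X_syn))])

# Hyperparameter selection via ordinary cross-validation on the surrogate task.
candidates = [50, 200]
scores = {n: cross_val_score(RandomForestClassifier(n_estimators=n, random_state=0),
                             X_sup, y_sup, cv=5).mean()
          for n in candidates}
best_n = max(scores, key=scores.get)

# Feature selection: importances indicate which features help separate the
# real data from the synthetic reference sample, i.e. which are informative.
clf = RandomForestClassifier(n_estimators=best_n, random_state=0).fit(X_sup, y_sup)
print("CV scores:", scores)
print("feature importances:", np.round(clf.feature_importances_, 3))
```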
machine learning, statistics, hyperparameter, self-supervised learning, noisy signal, outlier detection
