Convolutional autoencoder-based multimodal one-class classification

Name: Convolutional autoencoder-based multimodal one-class classification
Keywords: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (cs.LG)

Laakom, Firas; Sohrab, Fahad; Raitoharju, Jenni; Iosifidis, Alexandros; Gabbouj, Moncef

arXiv.org e-Print Archive

Preprint . 2023

Data sources: arXiv.org e-Print Archive

https://doi.org/10.1109/cismco...

Article . 2025 . Peer-reviewed

License: STM Policy #29

Data sources: Crossref

Research.fi

Article . 2025 . Peer-reviewed

Data sources: Research.fi

https://dx.doi.org/10.48550/ar...

Article . 2023

License: arXiv Non-Exclusive Distribution

Data sources: Datacite

Publication: Article, Preprint . 17 Mar 2025

Embargo end date: 01 Jan 2023

Publisher: IEEE

Journal: 2025 IEEE Symposium on Computational Intelligence in Image, Signal Processing and Synthetic Media Companion (CISM Companion)

Funded by: AKA | Taxa Identification with ...

doi: 10.1109/cismcompanion65074.2025.11032368 , 10.48550/arxiv.2309.14090

arXiv: 2309.14090

Abstract

One-class classification refers to approaches that learn from data of a single class only. In this paper, we propose a deep learning one-class classification method suitable for multimodal data. It relies on two convolutional autoencoders that are jointly trained to reconstruct the positive input data while keeping the latent-space representations of that data as compact as possible. During inference, the distance of an input's latent representation from the origin can be used as an anomaly score. Experimental results on a multimodal macroinvertebrate image classification dataset show that the proposed multimodal method yields better results than the unimodal approach. Furthermore, we study the effect of different input image sizes and investigate how recently proposed feature diversity regularizers affect the performance of our approach. We show that such regularizers improve performance.

5 pages, 1 figure, 4 tables
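
The abstract's two main ideas (a joint objective of reconstruction plus latent compactness, and an anomaly score given by the latent distance to the origin) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `compactness_weight` parameter, the use of squared Euclidean norms, and the choice to combine the two modalities by treating their latent vectors as one concatenated vector are all assumptions made for illustration. The latent vectors are plain lists standing in for the outputs of the two convolutional encoders.

```python
import math


def squared_norm(vec):
    """Squared Euclidean norm of a flat list of floats."""
    return sum(x * x for x in vec)


def joint_loss(inputs, reconstructions, latents, compactness_weight=1.0):
    """Hypothetical joint objective over all modalities.

    Sums the squared reconstruction error of every modality and adds a
    compactness term that pulls every latent vector toward the origin.
    The weighting scheme is an assumption, not taken from the paper.
    """
    recon = sum(
        sum((x - r) ** 2 for x, r in zip(inp, rec))
        for inp, rec in zip(inputs, reconstructions)
    )
    compact = sum(squared_norm(z) for z in latents)
    return recon + compactness_weight * compact


def anomaly_score(latent_a, latent_b):
    """Distance to the origin of the two latents, treated as one vector.

    Concatenating the modalities is an assumed fusion rule.
    """
    return math.sqrt(squared_norm(latent_a) + squared_norm(latent_b))


def is_anomalous(latent_a, latent_b, threshold):
    """Flag an input whose score exceeds a user-chosen threshold."""
    return anomaly_score(latent_a, latent_b) > threshold
```

In use, the threshold would typically be chosen from scores of held-out positive samples, for example a high percentile of their anomaly scores; that selection rule is likewise an assumption and is not stated in the abstract.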

Impact by BIP!

selected citations: 0 (These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).)
popularity: Average (This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.)
influence: Average (This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).)
impulse: Average (This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.)

Green

Funded by

AKA | Taxa Identification with Machine Learning Enhanced by DNA Metabarcoding (TIMED)

Related to Research communities

UArctic