
In this paper we address the task of Disturbing Image Detection (DID) by exploiting knowledge encoded in Large Multimodal Models (LMMs). Specifically, we propose to exploit LMM knowledge in a two-fold manner: first by extracting generic semantic descriptions, and second by extracting elicited emotions. Subsequently, we use CLIP's text encoder to obtain the text embeddings of both the generic semantic descriptions and the LMM-elicited emotions. Finally, we combine these text embeddings with the corresponding CLIP image embeddings to perform the DID task. The proposed method significantly improves the baseline classification accuracy, achieving state-of-the-art performance on the augmented Disturbing Image Detection dataset.
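As a concrete illustration of the pipeline described above, the following is a minimal sketch assuming Hugging Face's transformers CLIP implementation. The description and emotion strings are hypothetical placeholders standing in for actual LMM outputs, and the fusion strategy (mean-pooling the text embeddings and concatenating with the image embedding before a linear classification head) is an illustrative choice, not necessarily the paper's exact architecture.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical LMM outputs for one image: a generic semantic
# description and a list of elicited emotions (placeholders; in the
# paper these are produced by prompting an LMM).
description = "a dimly lit room with scattered debris on the floor"
emotions = ["fear", "disgust", "sadness"]

# Text embeddings of the description and the emotions via CLIP's text encoder.
text_inputs = processor(
    text=[description] + emotions,
    return_tensors="pt", padding=True, truncation=True,
)
with torch.no_grad():
    text_emb = model.get_text_features(**text_inputs)    # (1 + |emotions|, d)

# Image embedding via CLIP's image encoder (hypothetical file path).
image = Image.open("example.jpg")
image_inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    img_emb = model.get_image_features(**image_inputs)   # (1, d)

# Fuse: mean-pool the text embeddings and concatenate with the image
# embedding, then apply a binary head (disturbing vs. non-disturbing);
# the head would be trained on the DID dataset.
features = torch.cat([img_emb, text_emb.mean(dim=0, keepdim=True)], dim=-1)
classifier = torch.nn.Linear(features.shape[-1], 2)
logits = classifier(features)
```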
Accepted for publication at the LVLM Workshop @ IEEE Int. Conf. on Image Processing (ICIP 2024), Abu Dhabi, United Arab Emirates, Oct. 2024. This is the authors' accepted version.
FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV)
| Indicator | Description | Value |
| --- | --- | --- |
| Citations | Total citations received, based on the underlying citation network; an alternative to the "Influence" indicator | 2 |
| Popularity | "Current" impact/attention of the article in the research community, based on the underlying citation network | Top 10% |
| Influence | Overall/total impact of the article in the research community (diachronically), based on the underlying citation network | Average |
| Impulse | Initial momentum of the article directly after its publication, based on the underlying citation network | Average |
