Hyperbolic Safety-Aware Vision-Language Models

Name: Hyperbolic Safety-Aware Vision-Language Models
Keywords: FOS: Computer and information sciences, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Computation and Language (cs.CL), Computer Science - Multimedia, Multimedia (cs.MM)

Tobia Poppi; Tejaswi Kasarla; Pascal Mettes; Lorenzo Baraldi; Rita Cucchiara

arXiv.org e-Print Archive

Preprint . 2025

Data sources: arXiv.org e-Print Archive

https://doi.org/10.1109/cvpr52...

Article . 2025 . Peer-reviewed

License: STM Policy #29

Data sources: Crossref

IRIS UNIMORE - Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Conference object . 2025

Data sources: IRIS UNIMORE - Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

https://dx.doi.org/10.48550/ar...

Article . 2025

License: arXiv Non-Exclusive Distribution

Data sources: Datacite

Article, Preprint, Conference object . 10 Jun 2025
Embargo end date: 01 Jan 2025
Publisher: IEEE
Journal: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

doi: 10.1109/cvpr52734.2025.00399 , 10.48550/arxiv.2503.12127

arXiv: 2503.12127

handle: 11380/1373631

Abstract

Addressing the retrieval of unsafe content from vision-language models such as CLIP is an important step towards real-world integration. Current efforts have relied on unlearning techniques that try to erase the model's knowledge of unsafe concepts. While effective in reducing unwanted outputs, unlearning limits the model's capacity to discern between safe and unsafe content. In this work, we introduce a novel approach that shifts from unlearning to an awareness paradigm by leveraging the inherent hierarchical properties of hyperbolic space. We propose to encode safe and unsafe content as an entailment hierarchy, placing the two in different regions of hyperbolic space. Our HySAC, Hyperbolic Safety-Aware CLIP, employs entailment loss functions to model the hierarchical and asymmetrical relations between safe and unsafe image-text pairs. This modelling, ineffective in standard vision-language models due to their reliance on Euclidean embeddings, endows the model with awareness of unsafe content, enabling it to serve as both a multimodal unsafe classifier and a flexible content retriever, with the option to dynamically redirect unsafe queries toward safer alternatives or retain the original output. Extensive experiments show that our approach not only enhances safety recognition but also establishes a more adaptable and interpretable framework for content moderation in vision-language models. Our source code is available at https://github.com/aimagelab/HySAC.
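
The abstract does not give the form of the entailment loss, so the following is only a minimal sketch of one common formulation: an entailment-cone penalty in the Lorentz model of hyperbolic space. The curvature parameter `c`, the aperture constant `k`, and all function names are assumptions made for illustration and are not taken from the HySAC code. The pairing used in the usage note (a safe embedding as parent, an unsafe one as child) is likewise hypothetical.

```python
import numpy as np


def time_component(x_space, c=1.0):
    # Lorentz model with curvature -c: x_time = sqrt(1/c + ||x_space||^2).
    return np.sqrt(1.0 / c + np.sum(x_space ** 2, axis=-1))


def lorentz_inner(x_space, y_space, c=1.0):
    # Lorentzian inner product: <x, y>_L = <x_space, y_space> - x_time * y_time.
    return (np.sum(x_space * y_space, axis=-1)
            - time_component(x_space, c) * time_component(y_space, c))


def half_aperture(x_space, c=1.0, k=0.1, eps=1e-8):
    # Cone half-aperture shrinks as the parent moves away from the origin.
    # k is an assumed boundary constant, not a value from the paper.
    norm = np.linalg.norm(x_space, axis=-1)
    ratio = 2.0 * k / (np.sqrt(c) * norm + eps)
    return np.arcsin(np.clip(ratio, -1.0 + eps, 1.0 - eps))


def exterior_angle(x_space, y_space, c=1.0, eps=1e-8):
    # Angle at x between the cone axis (away from the origin) and the geodesic to y.
    x_time = time_component(x_space, c)
    y_time = time_component(y_space, c)
    c_xyl = c * lorentz_inner(x_space, y_space, c)
    num = y_time + x_time * c_xyl
    den = (np.linalg.norm(x_space, axis=-1)
           * np.sqrt(np.clip(c_xyl ** 2 - 1.0, eps, None)))
    return np.arccos(np.clip(num / (den + eps), -1.0 + eps, 1.0 - eps))


def entailment_loss(parent_space, child_space, c=1.0, k=0.1):
    # Zero when the child lies inside the parent's cone, positive otherwise.
    angle = exterior_angle(parent_space, child_space, c)
    return np.maximum(0.0, angle - half_aperture(parent_space, c, k))
```

As a usage illustration under the same assumptions, `entailment_loss(np.array([0.5, 0.0]), np.array([2.0, 0.0]))` evaluates to zero because the child lies farther from the origin along the parent's direction, while swapping the two arguments gives a positive penalty. That asymmetry is the property the abstract relies on and that a symmetric similarity score does not provide.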

CVPR 2025

Related Organizations

University of Amsterdam
Netherlands
University of Modena and Reggio Emilia
Italy

Impact by BIP!

selected citations: 0
These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).

popularity: Average
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.

influence: Average
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).

impulse: Average
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Green

Related to Research communities

Netherlands Research Portal