Can Image Compression Rely on CLIP?

Tom Bachard; Thomas Maugey

IEEE Access

Article . 2024 . Peer-reviewed

License: CC BY NC ND

Data sources: Crossref

IEEE Access

Article . 2024

Data sources: DOAJ

INRIA2

Article . 2024

License: CC BY

Data sources: INRIA2

HAL-Rennes 1

Article . 2024

License: CC BY

Data sources: HAL-Rennes 1

INRIA a CCSD electronic archive server

Article . 2024

License: CC BY

Data sources: INRIA a CCSD electronic archive server

Publication: Article, 01 Jan 2024
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Journal: IEEE Access, volume 12, pages 78922-78938 (eISSN: 2169-3536)
Funded by: ANR | MADARE, ANR | AI4SDA

doi: 10.1109/access.2024.3408651

Abstract

Coding algorithms are usually designed to faithfully reconstruct images, which limits the expected gains in compression. A new approach based on generative models allows for new compression algorithms that can reach drastically lower compression rates. Instead of pixel fidelity, these algorithms aim at faithfully generating images that have the same high-level interpretation as their inputs. In that context, the challenge becomes to set a good representation for the semantics of an image. While text or segmentation maps have been investigated and have shown their limitations, in this paper, we ask the following question: do powerful foundation models such as CLIP provide a semantic description suited for compression? By suited for compression, we mean that this description is robust to traditional compression tools and, in particular, quantization. We show that CLIP fulfills semantic robustness properties. This makes it an interesting support for generative compression. To make that intuition concrete, we propose a proof-of-concept for a generative codec based on CLIP. Results demonstrate that our CLIP-based coder beats state-of-the-art compression pipelines at extremely low bitrates (0.0012 BPP), both in terms of image quality (65.3 for MUSIQ) and semantic preservation (0.86 for the CLIP score).
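
As a rough illustration of what "robust to quantization" can mean for an embedding, the sketch below quantizes a vector with a uniform scalar quantizer and checks how much its direction changes, then converts the bit budget into bits per pixel. This is not the authors' codec: the random Gaussian vector merely stands in for a CLIP embedding, and the 512-dimension size, the uniform quantizer, the bit depths, and the 768x512 image size are all assumptions made for the example.

```python
import math
import random


def quantize(vec, bits):
    """Uniform scalar quantization of each component to 2**bits levels."""
    lo, hi = min(vec), max(vec)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    # Integer indices are what an encoder would transmit.
    indices = [round((x - lo) / step) for x in vec]
    # Decoder-side reconstruction from the indices plus (lo, step).
    recon = [lo + i * step for i in indices]
    return indices, recon


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def bits_per_pixel(total_bits, width, height):
    """Bitrate in bits per pixel (BPP) for a given bit budget and image size."""
    return total_bits / (width * height)


random.seed(0)
DIM = 512  # assumed embedding size, for illustration only
embedding = [random.gauss(0.0, 1.0) for _ in range(DIM)]  # stand-in, not a real CLIP vector

for bits in (2, 4, 8):
    _, recon = quantize(embedding, bits)
    bpp = bits_per_pixel(DIM * bits, 768, 512)  # assumed 768x512 image
    print(f"{bits} bits/dim: cosine={cosine(embedding, recon):.4f}, bpp={bpp:.4f}")
```

The point of the sketch is only the trade-off: fewer bits per component lower the BPP but move the reconstructed vector further from the original. Whether a real CLIP embedding keeps its semantic content under such quantization is the question the paper investigates.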

Related Organizations

French National Centre for Scientific Research
France
University of Southern Brittany
France
French Institute for Research in Computer Science and Automation
France
Université de Rennes 1
France
University of Rennes
France

Keywords

Compression algorithms, Image coding, Deep learning, Image reconstruction, Image processing, Image representation, Semantic, [INFO] Computer Science [cs], TK1-9971, Electrical engineering. Electronics. Nuclear engineering

Impact by BIP!

	selected citations: 0
	popularity: Average
	influence: Average
	impulse: Average

Green

gold

Related to Research communities

INRIA