
Post-training quantization (PTQ) is a powerful technique for model compression, reducing the numerical precision of a neural network without additional training overhead. Recent work has investigated adopting 8-bit floating-point formats (\code{FP8}) in the context of PTQ for model inference. However, floating-point formats smaller than 8 bits, and their comparison with integers in terms of accuracy and hardware cost, remain unexplored on FPGAs. In this work, we present minifloats, which are reduced-precision floating-point formats capable of further reducing the memory footprint, latency, and energy cost of a model while approaching full-precision model accuracy. We implement a custom FPGA-based multiply-accumulate operator library and explore the vast design space, comparing minifloat and integer representations across 3 to 8 bits for both weights and activations. We also examine the applicability of various integer-based quantization techniques to minifloats. Our experiments show that minifloats offer a promising alternative for emerging workloads such as vision transformers.
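To make the minifloat idea concrete, the sketch below quantizes a real value to a configurable floating-point format with a chosen number of exponent and mantissa bits, using round-to-nearest-even. This is an illustrative assumption of how such a quantizer could work, not the paper's implementation: the function name, the default IEEE-style bias, and the choice to treat the top exponent code as a normal value (no infinities or NaNs, as is common in sub-8-bit formats) are all assumptions made here.

```python
import math

def quantize_minifloat(x, exp_bits, mant_bits, bias=None):
    """Round x to the nearest value representable in a minifloat with
    `exp_bits` exponent bits and `mant_bits` mantissa bits (a sketch:
    no NaN/Inf encodings; the top exponent code is used for normals)."""
    if bias is None:
        # IEEE-style default bias, e.g. 7 for a 4-bit exponent
        bias = 2 ** (exp_bits - 1) - 1
    if x == 0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    a = abs(x)
    e_min = 1 - bias                      # smallest normal exponent
    e_max = 2 ** exp_bits - 1 - bias      # largest exponent (assumption: no Inf)
    e = math.floor(math.log2(a))
    if e < e_min:
        # subnormal range: fixed quantization step 2^(e_min - mant_bits)
        scale = 2.0 ** (e_min - mant_bits)
        q = round(a / scale) * scale
    else:
        e = min(e, e_max)
        scale = 2.0 ** (e - mant_bits)    # step size within this binade
        q = round(a / scale) * scale
        # clamp to the largest representable magnitude
        max_val = (2.0 - 2.0 ** (-mant_bits)) * 2.0 ** e_max
        q = min(q, max_val)
    return sign * q

# Example: a 4-bit-exponent, 3-bit-mantissa format (an FP8 E4M3-like layout)
print(quantize_minifloat(0.3, 4, 3))     # 0.3125 (nearest representable value)
print(quantize_minifloat(1000.0, 4, 3))  # 480.0 (clamped to the format's max)
```

Shrinking `exp_bits` and `mant_bits` (e.g. to a 4-bit E2M1 format) coarsens the representable grid, which is exactly the accuracy-versus-hardware-cost trade-off the paper's design-space exploration measures.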
FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Performance, minifloats, Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (cs.LG), multiply-accumulate, Performance (cs.PF), Artificial Intelligence (cs.AI), Hardware Architecture (cs.AR), quantization, Computer Science - Hardware Architecture
