ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization

Publication: Article, Preprint, 01 Oct 2022

Embargo end date: 01 Jan 2022

Publisher: IEEE

Journal: 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO)

Funded by: FCT | EMC2

Authors: Guo, Cong; Zhang, Chen; Leng, Jingwen; Liu, Zihan; Yang, Fan; Liu, Yunxin; Guo, Minyi; +1 Authors

doi: 10.1109/micro56248.2022.00095, 10.48550/arxiv.2208.14286

arXiv: 2208.14286

Abstract

Quantization is a technique to reduce the computation and memory cost of DNN models, which are getting increasingly large. Existing quantization solutions use fixed-point integer or floating-point types, which have limited benefits, as both require more bits to maintain the accuracy of the original models. On the other hand, variable-length quantization uses low-bit quantization for normal values and high precision for a fraction of outlier values. Even though this line of work brings algorithmic benefits, it also introduces significant hardware overheads due to variable-length encoding and decoding. In this work, we propose a fixed-length adaptive numerical data type called ANT to achieve low-bit quantization with tiny hardware overheads. ANT leverages two key innovations to exploit the intra-tensor and inter-tensor adaptive opportunities in DNN models. First, we propose a particular data type, flint, that combines the advantages of float and int to adapt to the importance of different values within a tensor. Second, we propose an adaptive framework that selects the best type for each tensor according to its distribution characteristics. We design a unified processing element architecture for ANT and show its ease of integration with existing DNN accelerators. Our design achieves a 2.8$\times$ speedup and a 2.5$\times$ improvement in energy efficiency over state-of-the-art quantization accelerators.
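The inter-tensor idea in the abstract (pick a data type per tensor based on its value distribution) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two candidate 4-bit grids (`int4` and a power-of-two grid called `pot4`), the max-abs scaling, the mean-squared-error criterion, and all function names are hypothetical. In particular, `pot4` is not the flint encoding, whose bit layout the abstract does not describe.

```python
def int_grid(bits=4):
    """Signed symmetric integer levels, e.g. -7..7 for 4 bits."""
    qmax = 2 ** (bits - 1) - 1
    return [float(v) for v in range(-qmax, qmax + 1)]


def pot_grid(bits=4):
    """Zero plus signed power-of-two levels (illustrative stand-in, NOT flint)."""
    n_mags = 2 ** (bits - 1) - 1          # nonzero magnitudes per sign
    mags = [2.0 ** k for k in range(n_mags)]
    return sorted([-m for m in mags] + [0.0] + mags)


def quantize(values, grid, scale):
    """Map each value to the nearest grid level, then rescale."""
    return [scale * min(grid, key=lambda g: abs(v / scale - g)) for v in values]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def select_type(values, candidates):
    """Return (name, error) for the candidate grid with the lowest MSE.

    Assumed criterion: scale so the largest magnitude hits the top level,
    then compare reconstruction error across candidate types.
    """
    max_abs = max(abs(v) for v in values)
    best = None
    for name, grid in candidates.items():
        scale = (max_abs / max(grid)) or 1.0   # avoid a zero scale for all-zero input
        err = mse(values, quantize(values, grid, scale))
        if best is None or err < best[1]:
            best = (name, err)
    return best


candidates = {"int4": int_grid(4), "pot4": pot_grid(4)}

# Evenly spread values versus mostly small values with one large outlier.
uniform = [i / 100 for i in range(-100, 101)]
peaked = [0.01 * k for k in range(-10, 11)] + [8.0]

for label, tensor in (("uniform", uniform), ("peaked", peaked)):
    name, err = select_type(tensor, candidates)
    print(f"{label}: chose {name} (mse={err:.5f})")
```

Under these assumptions, the evenly spaced integer grid tends to fit evenly spread values, while a grid with denser levels near zero tends to fit a tensor dominated by small values plus a few outliers. That is the kind of per-tensor difference an adaptive selection step can exploit while keeping every value at a fixed bit width.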

20 pages, accepted by MICRO 2022

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)

Related research

tensorrt software on GitHub (IsRelatedTo)

Impact by BIP!

selected citations: 20. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.