
arXiv: 2404.08131
Abstract: We present a post-training quantization algorithm with error estimates relying on ideas originating from frame theory. Specifically, we use first-order Sigma-Delta ($\Sigma\Delta$) quantization for finite unit-norm tight frames to quantize weight matrices and biases in a neural network. In our scenario, we derive an error bound between the original neural network and the quantized neural network in terms of step size and the number of frame elements. We also demonstrate how to leverage the redundancy of frames to achieve a quantized neural network with higher accuracy.
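As a rough illustration of the mechanism the abstract refers to, the sketch below applies generic first-order Sigma-Delta quantization to the frame coefficients of a single vector in R^2. It is not the paper's algorithm for weight matrices and biases. The roots-of-unity frame, the unbounded midrise quantizer, and all function names are assumptions chosen for the example. For this particular setup, summation by parts gives a reconstruction error of at most delta * (2*pi + 1) / N, which shows the dependence on step size and number of frame elements that the abstract mentions.

```python
import math

def roots_of_unity_frame(N):
    """N unit-norm vectors in R^2 forming a tight frame (assumed example frame, N >= 3)."""
    return [(math.cos(2 * math.pi * n / N), math.sin(2 * math.pi * n / N))
            for n in range(N)]

def midrise(v, delta):
    """Unbounded midrise quantizer with step delta: |v - midrise(v)| <= delta / 2."""
    return delta * (math.floor(v / delta) + 0.5)

def sigma_delta_quantize(coeffs, delta):
    """First-order Sigma-Delta: q_n = Q(u_{n-1} + y_n), u_n = u_{n-1} + y_n - q_n."""
    u = 0.0
    quantized, states = [], []
    for y in coeffs:
        q = midrise(u + y, delta)
        u = u + y - q          # state carries the running quantization error
        quantized.append(q)
        states.append(u)
    return quantized, states

def reconstruct(quantized, frame):
    """Linear reconstruction (d / N) * sum_n q_n e_n for a unit-norm tight frame, d = 2."""
    d, N = 2, len(frame)
    return tuple((d / N) * sum(q * e[i] for q, e in zip(quantized, frame))
                 for i in range(d))

def quantization_error(x, N, delta):
    """Euclidean distance between x and its Sigma-Delta reconstruction."""
    frame = roots_of_unity_frame(N)
    coeffs = [x[0] * e[0] + x[1] * e[1] for e in frame]   # frame coefficients <x, e_n>
    quantized, _ = sigma_delta_quantize(coeffs, delta)
    x_hat = reconstruct(quantized, frame)
    return math.hypot(x[0] - x_hat[0], x[1] - x_hat[1])

if __name__ == "__main__":
    x = (0.3, -0.7)  # stand-in for a slice of a weight matrix (hypothetical values)
    for N in (16, 64, 256, 1024):
        print(N, quantization_error(x, N, delta=0.1))
```

Increasing N with delta fixed tightens the bound in this toy setting, which is one way to read the abstract's remark about exploiting frame redundancy.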
FOS: Computer and information sciences, Computer Science - Machine Learning, post-training quantization, Statistics - Machine Learning, Computer Science - Information Theory, Information Theory (cs.IT), Learning and adaptive systems in artificial intelligence, sigma-delta quantization, finite frames, Machine Learning (stat.ML), General harmonic expansions, frames, neural network quantization, Machine Learning (cs.LG)
