Xilinx/brevitas: Release v0.9.0

Highlights

- Initial support for graph quantization to programmatically generate a quantized model from a floating-point one. ImageNet examples with PTQ can be found here: https://github.com/Xilinx/brevitas/tree/master/src/brevitas_examples/imagenet_classification/ptq
- Initial support for QuantMultiheadAttention, which is leveraged for e.g. ViT support above.
- Various improvements to graph equalization, which are leveraged in the PTQ examples above.
- New accumulation-aware quantizers, to train for low-precision accumulation, based on our A2Q paper: https://arxiv.org/abs/2301.13376
- Experimental support for BatchQuant quantizer, based on https://arxiv.org/abs/2105.08952 , currently still untested.
- Initial support for learned rounding.

Overview of changes

Graph quantization
- Initial graph quantization support by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/549 https://github.com/Xilinx/brevitas/pull/574 https://github.com/Xilinx/brevitas/pull/532 https://github.com/Xilinx/brevitas/pull/579

Quantized layers
- Initial support for QuantMultiheadAttention https://github.com/Xilinx/brevitas/pull/568
- Breaking change: rename Quant(Adaptive)AvgPool to Trunc(Adaptive)AvgPool by @volcacius in https://github.com/Xilinx/brevitas/pull/562

Quantizers
- Weight normalization-based integer quantizers by @i-colbert in https://github.com/Xilinx/brevitas/pull/559
- Accumulator-aware weight quantization by @i-colbert in https://github.com/Xilinx/brevitas/pull/567
- BatchQuant quantizers support by @volcacius in https://github.com/Xilinx/brevitas/pull/563

QuantTensor
- Support to move QuantTensor across devices by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/528
- Initial support for interpolate and pixel_shuffle by @volcacius in https://github.com/Xilinx/brevitas/pull/578

PTQ
- Batch Norm support in graph equalization by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/531
- Mul support in graph equalization by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/530
- Learned round support by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/573
- MultiheadAttention and LayerNorm support in graph equalization by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/555
- Fix calibration over large number of batches by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/523

Export
- Itemize scalar quantize args only in TorchScript QCDQ by @volcacius in https://github.com/Xilinx/brevitas/pull/561
- Round avgpool export fixes by @volcacius in https://github.com/Xilinx/brevitas/pull/562

CI, linting
- Linter isort by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/505
- CI: bump isort from 5.10.1 to 5.11.5 by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/540
- Test: enable parallelism with pytest-xdist by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/513
- GHA workflow improvement by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/507
- Add support for yapf by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/511

FX
- Disable FX backport on 1.8.1+ by @volcacius in https://github.com/Xilinx/brevitas/pull/504

Examples
- Pretrained Resnet18 example on CIFAR10 targeting FINN by @volcacius in https://github.com/Xilinx/brevitas/pull/577
- Graph quantization + PTQ examples and benchmarking scripts by @Giuseppe5 in https://github.com/Xilinx/brevitas/pull/547 https://github.com/Xilinx/brevitas/pull/575 https://github.com/Xilinx/brevitas/pull/576

For the Full Changelog please check: https://github.com/Xilinx/brevitas/compare/v0.8.0...v0.9.0
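
To give a feel for what the accumulation-aware quantizers in the highlights are guarding against, here is a minimal pure-Python sketch of a worst-case overflow bound. It is not Brevitas code and does not use the Brevitas API: the function names `max_l1_norm` and `fits_accumulator` are hypothetical, and the bound is a conservative one derived directly from |sum(x_i * q_i)| <= max|x| * ||q||_1, which may differ in detail from the formulation in the A2Q paper.

```python
def max_input_magnitude(input_bits: int, input_signed: bool = False) -> int:
    """Largest absolute value an integer input of the given width can take."""
    # Signed two's complement reaches -2^(N-1); unsigned reaches 2^N - 1.
    return 2 ** (input_bits - 1) if input_signed else 2 ** input_bits - 1


def max_l1_norm(acc_bits: int, input_bits: int, input_signed: bool = False) -> float:
    """Conservative upper bound on the L1 norm of the integer weights.

    If sum(|q_i|) stays at or below this value, the dot product with any
    input vector of the given width fits in a signed acc_bits accumulator.
    """
    acc_max = 2 ** (acc_bits - 1) - 1  # largest positive accumulator value
    return acc_max / max_input_magnitude(input_bits, input_signed)


def fits_accumulator(weights, acc_bits: int, input_bits: int,
                     input_signed: bool = False) -> bool:
    """Check a list of integer weights against the bound (hypothetical helper)."""
    l1 = sum(abs(w) for w in weights)
    return l1 <= max_l1_norm(acc_bits, input_bits, input_signed)


if __name__ == "__main__":
    # Unsigned 8-bit inputs accumulated into a signed 16-bit register.
    print(max_l1_norm(acc_bits=16, input_bits=8))
    print(fits_accumulator([100, -28], acc_bits=16, input_bits=8))
    print(fits_accumulator([100, -29], acc_bits=16, input_bits=8))
```

The point of the sketch is only that a narrower accumulator forces a smaller weight L1 norm; a training-time quantizer would need to enforce such a constraint on the weights themselves rather than check it after the fact.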

Related Organizations

National University of Science and Technology, Oman
Northwestern University, United States
University of Paderborn, Germany
University of California, San Diego, United States

