quantization
Caffe implementation of Incremental Network Quantization
An out-of-the-box PyTorch scaffold for neural network quantization-aware training (QAT) research. Website: https://github.com/zhutmost/neuralzip
Unofficial implementation of LSQ-Net, a neural network quantization framework
ProxQuant: Quantized Neural Networks via Proximal Operators
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving it. PRs of relevant works are welcome (p…
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
This project is the official implementation of our accepted ICLR 2022 paper BiBERT: Accurate Fully Binarized BERT.
A collection of model quantization algorithms. For any issues, please contact Peng Chen ([email protected])
Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022. (A minimal straight-through-estimator sketch appears after this list.)
PyTorch implementation of our paper accepted at ECCV 2022 -- Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks
PyTorch implementation of SSQL (accepted to ECCV 2022 as an oral presentation)
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming
A quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.
The official implementation of the NeurIPS 2022 paper Q-ViT.
Code for "Adaptive Gradient Quantization for Data-Parallel SGD", published in NeurIPS 2020.
The official repository for the paper LAB: Learnable Activation Binarizer for Binary Neural Networks.
Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
This repository contains the PyTorch scripts to train mixed-precision networks for microcontroller deployment, based on the memory constraints of the target device.
Reorder-based post-training quantization for large language models
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
EasyQuant (EQ) is an efficient and simple post-training quantization method that optimizes the scales of weights and activations.
PyTorch implementation of Data Free Quantization Through Weight Equalization and Bias Correction.
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Example models using DeepSpeed
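Several of the entries above (e.g., the LSQ-Net reimplementation, the N2UQ code, and the QAT scaffolds) rely on fake quantization trained with a straight-through estimator (STE). Below is a minimal, hypothetical PyTorch sketch of that building block; it is not taken from any repository listed here, and the symmetric per-tensor scale and fixed bit-width are simplifying assumptions.

```python
# Illustrative sketch only: symmetric fake quantization with a
# straight-through estimator (STE). Not the implementation of any repo above.
import torch


class FakeQuantSTE(torch.autograd.Function):
    """Quantize in the forward pass; pass gradients through unchanged."""

    @staticmethod
    def forward(ctx, x, scale, num_bits=8):
        qmax = 2 ** (num_bits - 1) - 1
        q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
        return q * scale  # dequantized ("fake quantized") tensor

    @staticmethod
    def backward(ctx, grad_output):
        # STE: treat round() and clamp() as identity for the gradient of x;
        # no gradient is propagated to the scale or bit-width here.
        return grad_output, None, None


def fake_quantize(x, num_bits=8):
    # Per-tensor symmetric scale from the tensor's max (an assumption;
    # real frameworks learn or calibrate this value instead).
    scale = x.abs().max() / (2 ** (num_bits - 1) - 1)
    return FakeQuantSTE.apply(x, scale.clamp_min(1e-8), num_bits)


if __name__ == "__main__":
    w = torch.randn(4, 4, requires_grad=True)
    loss = fake_quantize(w, num_bits=4).sum()
    loss.backward()
    print(w.grad)  # gradients flow despite the non-differentiable rounding
```

Frameworks such as LSQ instead learn the scale as a trainable parameter, and post-training tools calibrate it from sample activations, but the gradient trick sketched here is the common denominator.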