Efficient weights quantization of convolutional neural networks using kernel density estimation based non-uniform quantizer

Sanghyun Seo, Juntae Kim

Research output: Contribution to journal › Article › peer-review

31 Scopus citations

Abstract

Convolutional neural networks (CNNs) have achieved excellent results in image recognition tasks that classify objects in images. A typical CNN uses a deep architecture with a large number of weights and layers to achieve high performance. CNNs therefore require relatively large memory space and computational cost, which not only increases training time but also limits real-time application of the trained model. For this reason, various neural network compression methods have been studied to use CNNs efficiently on small embedded hardware such as mobile and edge devices. In this paper, we propose a kernel density estimation based non-uniform quantization method that performs compression efficiently. The proposed method quantizes weights efficiently using a significantly smaller number of sampled weights than the number of original weights. Four-bit quantization experiments on ImageNet classification with various CNN architectures show that the proposed method can quantize weights efficiently in terms of computational cost without significant reduction in model performance.
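The core idea described above (fitting a density estimate to a small sample of the weights and placing quantization levels non-uniformly so that dense weight regions are represented more finely) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the level-placement rule here (mass-weighted centroids of equal-probability-mass bins of the estimated density) is a simplified stand-in for the Lloyd-Max quantizer named in the keywords, and all function names and parameters are assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_nonuniform_levels(weights, n_bits=4, n_samples=1000, seed=0):
    """Estimate 2**n_bits quantization levels from a small weight sample.

    A kernel density estimate (KDE) is fit on the sample only, so the cost
    depends on the sample size rather than the full weight count. Levels
    are placed at the mass-weighted centroids of equal-probability-mass
    bins of the estimated density, giving finer quantization where
    weights are dense (an illustrative rule, not the paper's exact one).
    """
    rng = np.random.default_rng(seed)
    sample = rng.choice(weights, size=min(n_samples, weights.size),
                        replace=False)
    kde = gaussian_kde(sample)

    # Evaluate the density on a fine grid spanning the sample range,
    # then build a discrete CDF over the grid.
    grid = np.linspace(sample.min(), sample.max(), 2048)
    density = kde(grid)
    cdf = np.cumsum(density)
    cdf /= cdf[-1]

    # Split the probability mass into 2**n_bits equal bins and take each
    # bin's density-weighted centroid as one quantization level.
    n_levels = 2 ** n_bits
    edges = np.searchsorted(cdf, np.linspace(0, 1, n_levels + 1)[1:-1])
    bins = np.split(np.arange(grid.size), edges)
    return np.array([np.average(grid[b], weights=density[b]) for b in bins])

def quantize(weights, levels):
    """Map each weight to its nearest quantization level."""
    idx = np.abs(weights[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx]
```

For a bimodal weight distribution, the 16 four-bit levels cluster around the two modes instead of being spaced uniformly, which is the motivation for a non-uniform quantizer.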

Original language: English
Article number: 559
Journal: Applied Sciences (Switzerland)
Volume: 9
Issue number: 12
DOIs
State: Published - 1 Jun 2019

Keywords

  • Convolutional neural networks
  • K-means clustering
  • Kernel density estimation
  • Lloyd-Max quantizer
  • Weights quantization

