Sparseness Ratio Allocation and Neuron Re-pruning for Neural Networks Compression

Li Guo, Dajiang Zhou, Jinjia Zhou, Shinji Kimura

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Convolutional neural networks (CNNs) are rapidly gaining popularity in artificial intelligence applications and are increasingly deployed on mobile devices. However, deployment is challenging because of the high computational complexity of CNNs and the limited hardware resources of mobile devices. Compressing the CNN model is an efficient way to address this issue. This work presents a new model compression framework that combines sparseness ratio allocation (SRA) and neuron re-pruning (NRP). To achieve a higher overall sparseness ratio, SRA determines the percentage of weights pruned in each layer. NRP is performed after the usual weight pruning to further remove relatively redundant neurons while preserving accuracy. In experiments, the proposed framework achieves 149.3× compression on LeNet-5 with a slight accuracy drop of 0.1%. The storage size is reduced by about 50% relative to previous works, and 8-45.2% of computational energy and 11.5-48.2% of memory traffic energy are saved.
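
The two-stage idea in the abstract (per-layer pruning ratios, then a second pass over neurons) can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: the abstract does not say how SRA chooses the ratios or how NRP scores neurons. The sketch assumes that the per-layer ratios are simply given, that weights are pruned by magnitude, and that a neuron counts as redundant when fewer than `min_nonzero` of its incoming weights survive. The function names, the `ratios` values, and the `min_nonzero` threshold are all hypothetical.

```python
import numpy as np


def prune_by_magnitude(weights, sparseness_ratio):
    """Zero out the smallest-magnitude fraction of a layer's weights."""
    k = int(round(sparseness_ratio * weights.size))  # number of weights to remove
    flat = weights.flatten()                         # flatten() returns a copy
    order = np.argsort(np.abs(flat))                 # indices, smallest magnitude first
    flat[order[:k]] = 0.0
    return flat.reshape(weights.shape)


def reprune_neurons(pruned, min_nonzero):
    """Assumed redundancy rule: drop neurons (rows) left with too few weights."""
    survivors = np.count_nonzero(pruned, axis=1)
    keep = survivors >= min_nonzero
    return pruned * keep[:, None], keep


def overall_sparseness(layers):
    """Fraction of zero weights across all layers."""
    total = sum(w.size for w in layers.values())
    nonzero = sum(np.count_nonzero(w) for w in layers.values())
    return 1.0 - nonzero / total


# Toy fully connected layers with shape (neurons, inputs); values are random.
rng = np.random.default_rng(0)
layers = {
    "fc1": rng.standard_normal((8, 16)),
    "fc2": rng.standard_normal((4, 8)),
}

# Hypothetical allocation: a larger layer tolerates a higher pruning ratio.
ratios = {"fc1": 0.9, "fc2": 0.5}

pruned = {name: prune_by_magnitude(w, ratios[name]) for name, w in layers.items()}
sparseness_after_weight_pruning = overall_sparseness(pruned)

repruned = {}
kept = {}
for name, w in pruned.items():
    repruned[name], kept[name] = reprune_neurons(w, min_nonzero=2)

print("sparseness after weight pruning:", sparseness_after_weight_pruning)
print("sparseness after neuron re-pruning:", overall_sparseness(repruned))
print("neurons kept per layer:", {n: int(k.sum()) for n, k in kept.items()})
```

In a real pipeline the network would be retrained after each pruning step to recover accuracy, and removing a neuron would also remove the matching input column of the following layer; both steps are omitted here for brevity.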

Original language: English
Title of host publication: 2018 IEEE International Symposium on Circuits and Systems, ISCAS 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538648810
DOIs: https://doi.org/10.1109/ISCAS.2018.8351094
Publication status: Published - 2018 Apr 26
Event: 2018 IEEE International Symposium on Circuits and Systems, ISCAS 2018 - Florence, Italy
Duration: 2018 May 27 to 2018 May 30

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
Volume: 2018-May
ISSN (Print): 0271-4310

Other

Other: 2018 IEEE International Symposium on Circuits and Systems, ISCAS 2018
Country: Italy
City: Florence
Period: 18/5/27 to 18/5/30

Keywords

  • Model compression
  • connection/neuron pruning
  • neuron re-pruning
  • sparseness ratio allocation

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Guo, L., Zhou, D., Zhou, J., & Kimura, S. (2018). Sparseness Ratio Allocation and Neuron Re-pruning for Neural Networks Compression. In 2018 IEEE International Symposium on Circuits and Systems, ISCAS 2018 - Proceedings [8351094] (Proceedings - IEEE International Symposium on Circuits and Systems; Vol. 2018-May). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISCAS.2018.8351094