Sparseness Ratio Allocation and Neuron Re-pruning for Neural Networks Compression

Li Guo, Dajiang Zhou, Jinjia Zhou, Shinji Kimura

Research output: Conference contribution

Abstract

Convolutional neural networks (CNNs) are rapidly gaining popularity in artificial intelligence applications and are increasingly deployed on mobile devices. However, this deployment is challenging because of the high computational complexity of CNNs and the limited hardware resources of mobile devices. Compressing the CNN model is an efficient way to address this issue. This work presents a new model compression framework comprising sparseness ratio allocation (SRA) and neuron re-pruning (NRP). SRA determines the percentage of weights to prune in each layer so as to achieve a higher overall sparseness ratio. NRP is performed after the usual weight pruning to further remove relatively redundant neurons while preserving accuracy. Experimental results show that, with a slight accuracy drop of 0.1%, the proposed framework achieves 149.3× compression on LeNet-5. The storage size is reduced by about 50% relative to previous works, and 8-45.2% of computational energy and 11.5-48.2% of memory traffic energy are saved.
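The two stages described above can be sketched in code. The following is a minimal illustration, not the authors' actual algorithm: `prune_layer` performs standard magnitude-based weight pruning to a per-layer sparseness ratio (as allocated by SRA), and `reprune_neurons` then fully zeroes output neurons whose surviving weight count falls below a threshold, a hypothetical redundancy criterion standing in for NRP. The layer names and the `keep_frac` parameter are assumptions for illustration.

```python
import numpy as np

def prune_layer(weights, sparseness_ratio):
    """Zero the smallest-magnitude weights until the layer reaches
    the given sparseness ratio (fraction of zero weights)."""
    flat = np.abs(weights).ravel()
    k = int(sparseness_ratio * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def reprune_neurons(pruned, keep_frac=0.5):
    """Re-prune: fully zero output neurons (rows) whose surviving
    weight count is below keep_frac of the per-neuron average.
    This criterion is a hypothetical illustration of NRP."""
    surviving = np.count_nonzero(pruned, axis=1)
    weak = surviving < keep_frac * surviving.mean()
    out = pruned.copy()
    out[weak, :] = 0.0
    return out

# Hypothetical per-layer sparseness allocation (the SRA output)
ratios = {"conv1": 0.5, "fc1": 0.9}

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16))          # toy fully connected layer
w_pruned = prune_layer(w, ratios["fc1"])
w_final = reprune_neurons(w_pruned)
print("sparseness:", np.mean(w_final == 0.0))
```

In practice the network would be fine-tuned after each pruning step to recover accuracy; the sketch omits retraining entirely.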

Original language: English
Host publication title: 2018 IEEE International Symposium on Circuits and Systems, ISCAS 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9781538648810
DOI
Publication status: Published - 26 Apr 2018
Event: 2018 IEEE International Symposium on Circuits and Systems, ISCAS 2018 - Florence, Italy
Duration: 27 May 2018 - 30 May 2018

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
Volume: 2018-May
ISSN (print): 0271-4310

Other

Other: 2018 IEEE International Symposium on Circuits and Systems, ISCAS 2018
Country/Territory: Italy
City: Florence
Period: 27/5/18 - 30/5/18

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
