A-A KD: Attention and Activation Knowledge Distillation

Aorui Gou*, Chao Liu, Heming Sun, Xiaoyang Zeng, Yibo Fan

*Corresponding author of this work

Research output: Conference contribution

Abstract

We propose a knowledge distillation method named attention and activation knowledge distillation (A-A KD) in this paper. By jointly exploiting the attention mechanism as an inter-channel method and activation information as an intra-channel method, the student model can overcome insufficient feature extraction and effectively mimic the features of the teacher model. A-A KD outperforms state-of-the-art methods on various tasks such as image classification, object detection, and semantic segmentation. It improves mAP by 1.8% on PASCAL VOC07 and mIoU by 1.5% on PASCAL VOC12 over conventional student models. Moreover, experimental results show that our student model (ResNet50) reaches a top-1 error of 21.42% with A-A KD on ImageNet, which is better than the corresponding teacher model.
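The sketch below illustrates one way an attention- and activation-based feature-distillation loss of this kind could be wired up, assuming PyTorch and teacher/student feature maps of matching shape. The abstract does not give the exact formulation, so the inter-channel (attention) and intra-channel (activation) terms, and the names channel_attention, spatial_activation, and aa_kd_loss, are illustrative assumptions rather than the authors' published definitions.

```python
# Minimal sketch of an attention + activation feature-distillation loss.
# Assumes 4-D feature maps (N, C, H, W) taken from corresponding layers of
# the teacher and student networks; the exact A-A KD loss is not specified
# in the abstract, so the terms below are illustrative only.
import torch
import torch.nn.functional as F


def channel_attention(feat: torch.Tensor) -> torch.Tensor:
    """Inter-channel statistic: globally pooled channel vector, L2-normalized."""
    w = feat.mean(dim=(2, 3))                      # (N, C)
    return F.normalize(w, dim=1)


def spatial_activation(feat: torch.Tensor) -> torch.Tensor:
    """Intra-channel statistic: per-location activation energy, L2-normalized."""
    a = feat.pow(2).mean(dim=1)                    # (N, H, W)
    return F.normalize(a.flatten(1), dim=1)


def aa_kd_loss(student_feat, teacher_feat, alpha=1.0, beta=1.0):
    """Combine inter-channel (attention) and intra-channel (activation) mimicry."""
    loss_attn = F.mse_loss(channel_attention(student_feat),
                           channel_attention(teacher_feat))
    loss_act = F.mse_loss(spatial_activation(student_feat),
                          spatial_activation(teacher_feat))
    return alpha * loss_attn + beta * loss_act


if __name__ == "__main__":
    s = torch.randn(2, 64, 32, 32)   # student feature map
    t = torch.randn(2, 64, 32, 32)   # teacher feature map (same shape assumed)
    print(aa_kd_loss(s, t).item())
```

In practice such a term would be added to the usual task loss (e.g. cross-entropy) with weighting factors alpha and beta tuned per task; this is a common pattern in feature-based distillation, not a claim about the authors' training recipe.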

Original language: English
Host publication title: Proceedings - 2021 IEEE 7th International Conference on Multimedia Big Data, BigMM 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 57-60
Number of pages: 4
ISBN (electronic): 9781665434140
DOI
Publication status: Published - 2021
Event: 7th IEEE International Conference on Multimedia Big Data, BigMM 2021 - Taichung, Taiwan, Province of China
Duration: 15 Nov 2021 to 17 Nov 2021

Publication series

Name: Proceedings - 2021 IEEE 7th International Conference on Multimedia Big Data, BigMM 2021

Conference

Conference: 7th IEEE International Conference on Multimedia Big Data, BigMM 2021
Country/Territory: Taiwan, Province of China
City: Taichung
Period: 15/11/21 to 17/11/21

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Information Systems
  • Information Systems and Management
  • Media Technology
