STED-Net: Self-taught encoder-decoder network for unsupervised feature representation

Songlin Du*, Takeshi Ikenaga

*Corresponding author for this work

Research output: Article › peer-review

Abstract

Compared with the great successes achieved by supervised learning, e.g. the convolutional neural network (CNN), unsupervised feature learning remains a highly challenging task because no training labels are available. Without labels for reference, the most challenging problem is blindly reducing the gap between learned features and image semantics. This paper proposes a Self-Taught Encoder-Decoder Network (STED-Net), consisting of a representation sub-network and a classification sub-network, for unsupervised feature learning. The representation sub-network maps images to feature representations. Using the features generated by the representation sub-network, the classification sub-network simultaneously maps the feature representations to class representations and estimates pseudo-labels by clustering the feature representations. By minimizing the distance between the class representations and the estimated pseudo-labels, STED-Net teaches the features to represent class information. Through this self-taught feature representation, the gap between features and image semantics is reduced, and the features become increasingly “class-aware”. The whole learning process of STED-Net does not refer to any ground-truth class labels. Experimental results on widely used image classification datasets show that STED-Net achieves state-of-the-art classification performance compared with existing supervised and unsupervised feature learning models.
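The abstract describes a training loop in which an encoder-decoder pair produces feature representations, a classifier maps those features to class representations, and pseudo-labels obtained by clustering the features supervise the classifier. The sketch below is a minimal, hypothetical illustration of that loop in PyTorch; the layer sizes, the use of k-means for pseudo-label estimation, the reconstruction term, and the equal loss weighting are all assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical sketch of the self-taught loop described in the abstract
# (not the authors' code; sizes, k-means, and loss weights are assumptions).
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class STEDNet(nn.Module):
    def __init__(self, feat_dim=64, n_classes=10):
        super().__init__()
        # Representation sub-network: encoder-decoder pair over 28x28 images.
        self.encoder = nn.Sequential(
            nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(),
            nn.Linear(256, feat_dim))
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 28 * 28))
        # Classification sub-network: features -> class representation.
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        z = self.encoder(x)        # feature representation
        recon = self.decoder(z)    # reconstruction of the input
        logits = self.classifier(z)  # class representation
        return z, recon, logits

def train_step(model, opt, x, n_classes=10):
    z, recon, logits = model(x)
    # Estimate pseudo-labels by clustering the feature representations.
    # k-means is an assumed choice of clustering algorithm.
    with torch.no_grad():
        km = KMeans(n_clusters=n_classes, n_init=10).fit(
            z.detach().cpu().numpy())
        pseudo = torch.as_tensor(km.labels_, dtype=torch.long,
                                 device=z.device)
    # Self-taught objective: pull the class representation toward the
    # estimated pseudo-labels, plus a reconstruction term; no ground-truth
    # class labels are used anywhere.
    loss = (nn.functional.cross_entropy(logits, pseudo)
            + nn.functional.mse_loss(recon, x.flatten(1)))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage (hypothetical):
#   model = STEDNet()
#   opt = torch.optim.Adam(model.parameters(), lr=1e-3)
#   for x in loader: train_step(model, opt, x)
```

Note that in this per-batch sketch the cluster indices returned by k-means are arbitrary on each call, so a practical implementation would estimate pseudo-labels over the whole dataset (e.g. once per epoch) and keep cluster-to-class assignments consistent between updates.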

Original language: English
Pages (from-to): 4673-4691
Number of pages: 19
Journal: Multimedia Tools and Applications
Volume: 80
Issue: 3
Publication status: Published - Jan 2021

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
