STED-Net: Self-taught encoder-decoder network for unsupervised feature representation

Songlin Du*, Takeshi Ikenaga

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Compared with the great successes achieved by supervised learning, e.g., convolutional neural networks (CNNs), unsupervised feature learning remains a highly challenging task because no training labels are available. Without labels for reference, the central difficulty is reducing the gap between features and image semantics blindly. This paper proposes a Self-Taught Encoder-Decoder Network (STED-Net), which consists of a representation sub-network and a classification sub-network, for unsupervised feature learning. The representation sub-network maps images to a feature representation. Using the features generated by the representation sub-network, the classification sub-network simultaneously maps the feature representation to a class representation and estimates pseudo labels by clustering the feature representation. By minimizing the distance between the class representation and the estimated pseudo labels, STED-Net teaches the features to represent class information. Through this self-taught feature representation, the gap between features and image semantics is reduced, and the features become increasingly “class-aware”. The whole learning process of STED-Net does not refer to any ground-truth class labels. Experimental results on widely used image classification datasets show that STED-Net achieves state-of-the-art classification performance compared with existing supervised and unsupervised feature learning models.
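
The training loop described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the single-layer `tanh` encoder, the linear classification head, plain k-means as the clustering step, and cross-entropy as the "distance" between class representation and pseudo labels are all assumptions made for illustration, and the names `kmeans_labels` and `self_taught_step` are hypothetical. Images are assumed to be flattened into row vectors.

```python
import numpy as np

def kmeans_labels(feats, k, iters=10, seed=0):
    """Plain k-means (assumed clustering step); returns one pseudo label per row."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct samples; the fixed seed keeps the
    # initial samples the same from one call to the next.
    centers = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = feats[labels == j].mean(axis=0)
    return labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def self_taught_step(x, w_enc, w_cls, k, lr=0.1):
    """One update: encode, cluster features into pseudo labels, fit the class output to them."""
    n = len(x)
    feats = np.tanh(x @ w_enc)        # representation sub-network (assumed single layer)
    pseudo = kmeans_labels(feats, k)  # pseudo labels, treated as constants below
    target = np.eye(k)[pseudo]        # one-hot pseudo labels
    probs = softmax(feats @ w_cls)    # class representation (assumed linear head)
    # Cross-entropy stands in for the unspecified "distance" in the abstract.
    loss = -np.log(probs[np.arange(n), pseudo] + 1e-12).mean()

    # Backpropagate through the head and the encoder (not through the clustering).
    g_logits = (probs - target) / n
    g_cls = feats.T @ g_logits
    g_feats = g_logits @ w_cls.T
    g_enc = x.T @ (g_feats * (1.0 - feats ** 2))
    return w_enc - lr * g_enc, w_cls - lr * g_cls, loss, pseudo
```

Calling `self_taught_step` repeatedly, feeding the returned weights back in, alternates between re-estimating pseudo labels and updating both sub-networks, which is the self-teaching idea in miniature. No ground-truth labels appear anywhere in the loop.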

Original language: English
Pages (from-to): 4673-4691
Number of pages: 19
Journal: Multimedia Tools and Applications
Issue number: 3
Publication status: Published - 2021 Jan


Keywords

  • Autoencoder
  • Feature representation
  • Self-taught learning
  • Unsupervised learning

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications

