Unsupervised feature representation is a challenging problem in machine learning and computer vision. Because manual labels are unavailable for training, it is difficult to reduce the gap between learned features and image semantics. This paper proposes an iterative autoencoding and clustering approach for unsupervised feature representation, consisting of an autoencoding sub-network and a classification sub-network. The autoencoding sub-network maps images to features. The classification sub-network takes these features as input, maps them to classes, and simultaneously estimates pseudo labels by clustering them. Iterating between feature representation and classification supervised by the pseudo labels reduces the gap between features and image semantics. Experimental results on handwritten digit recognition and object classification show that the proposed approach achieves state-of-the-art performance compared with existing methods.
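The alternation between feature learning and pseudo-label classification described above can be illustrated with a small sketch. This is not the paper's implementation; it is a minimal illustration under several assumptions that do not come from the text: a linear encoder and decoder stand in for the autoencoding sub-network, a single softmax layer stands in for the classification sub-network, plain k-means is used as the clustering step, and clustering alternates with training rather than running simultaneously. All names (`kmeans`, `train`, `w_enc`, `w_dec`, `w_cls`) and hyperparameters are hypothetical.

```python
import numpy as np

def kmeans(z, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row of z."""
    rng = np.random.default_rng(seed)
    centers = z[rng.choice(len(z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = z[labels == j].mean(axis=0)
    return labels

def softmax(a):
    a = a - a.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def train(x, k, hidden=8, rounds=5, steps=50, lr=0.01, seed=0):
    """Alternate between clustering features and fitting to the pseudo labels."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w_enc = rng.normal(scale=0.1, size=(d, hidden))  # images -> features
    w_dec = rng.normal(scale=0.1, size=(hidden, d))  # features -> reconstruction
    w_cls = rng.normal(scale=0.1, size=(hidden, k))  # features -> class scores
    for _round in range(rounds):
        # Step 1: estimate pseudo labels by clustering the current features.
        pseudo = kmeans(x @ w_enc, k, seed=seed)
        onehot = np.eye(k)[pseudo]
        # Step 2: minimise reconstruction error plus cross-entropy
        # against the pseudo labels with plain gradient descent.
        for _step in range(steps):
            z = x @ w_enc
            g_rec = 2.0 * (z @ w_dec - x) / n          # grad of mean squared error
            g_log = (softmax(z @ w_cls) - onehot) / n  # grad of mean cross-entropy
            g_z = g_rec @ w_dec.T + g_log @ w_cls.T    # both losses reach the encoder
            w_dec -= lr * (z.T @ g_rec)
            w_cls -= lr * (z.T @ g_log)
            w_enc -= lr * (x.T @ g_z)
    z = x @ w_enc
    return z, softmax(z @ w_cls).argmax(axis=1)
```

Calling `train(x, k=10)` on an array `x` of flattened images (one row per image) returns the learned features and the predicted class index for each image. The classification gradient flows back into the encoder, which is what lets the pseudo labels shape the features from one round to the next.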