We propose an effective use of faulty-state data to achieve robust, accurate data-driven anomaly (fault) detection for rotating machines. Although using faulty-state data in training can generally improve the performance of an anomaly detection system, enough samples of failures or defects are rarely available for a target machine. We therefore use existing data from non-target (different-type) machines for feature representation learning to improve anomaly detection for the target machine. Specifically, deep neural networks (DNNs) trained to discriminate between the normal and faulty states of the non-target machines are used as feature extractors. The extracted features are then fed to an anomaly detector based on Gaussian mixture models (GMMs). We call this architecture DNN/GMM tandem connectionist anomaly detection. Experimental comparisons using vibration signals from actual wind turbine components demonstrated that the developed tandem connectionist system yielded significant improvements over existing systems, and that the representation learning was robust to differences in machine types.
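
The tandem idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, synthetic Gaussian data in place of vibration features, a single hidden layer whose ReLU activations serve as the extracted features, and a GMM fitted only on normal-state data from the target machine. The dimension, offsets, layer size, and number of mixture components are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
DIM = 8  # hypothetical feature dimension (stand-in for vibration features)


def make_machine(offset, n):
    """Synthetic stand-in for one machine: normal frames and shifted 'faulty' frames."""
    normal = rng.normal(offset, 1.0, size=(n, DIM))
    faulty = rng.normal(offset + 3.0, 1.0, size=(n, DIM))
    return normal, faulty


# Non-target machine: labelled normal and faulty data are assumed to be available.
src_normal, src_faulty = make_machine(offset=0.0, n=500)
X_src = np.vstack([src_normal, src_faulty])
y_src = np.concatenate([np.zeros(500), np.ones(500)])

# Step 1: train a small network to discriminate normal vs. faulty states
# on the non-target machine (a stand-in for the DNN in the text).
dnn = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
dnn.fit(X_src, y_src)


def extract_features(x):
    """Hidden-layer ReLU activations of the trained network, used as features.

    Which layer to extract from is an assumption of this sketch.
    """
    return np.maximum(0.0, x @ dnn.coefs_[0] + dnn.intercepts_[0])


# Target machine: only normal-state data is used to fit the detector.
tgt_normal_train, _ = make_machine(offset=0.5, n=300)
tgt_normal_test, tgt_faulty_test = make_machine(offset=0.5, n=200)

# Step 2: fit a GMM on the network features of the target machine's normal data.
gmm = GaussianMixture(
    n_components=2, covariance_type="diag", reg_covar=1e-3, random_state=0
)
gmm.fit(extract_features(tgt_normal_train))


def anomaly_score(x):
    """Negative GMM log-likelihood of the features; higher means more anomalous."""
    return -gmm.score_samples(extract_features(x))


scores_normal = anomaly_score(tgt_normal_test)
scores_faulty = anomaly_score(tgt_faulty_test)
print("mean score, normal frames:", scores_normal.mean())
print("mean score, faulty frames:", scores_faulty.mean())
```

In this sketch the detector never sees faulty data from the target machine. Fault information enters only through the feature extractor trained on the other machine, which is the role the text assigns to the non-target data.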