This study shows that a novel recurrent neural network model can learn to reproduce fluctuating training sequences by inferring their stochastic structure. The network learns to predict not only the mean of the next input state but also its time-varying variance. The network is trained by maximum likelihood estimation using gradient descent, where the likelihood function is expressed in terms of both the predicted mean and the predicted variance. To evaluate the performance of the model, we first tested its ability to reproduce fluctuating training sequences generated by a known dynamical system and perturbed by Gaussian noise with state-dependent variance. Our analysis showed that the network can reproduce the sequences by predicting the variance correctly. A second experiment showed that a humanoid robot equipped with the network can learn to reproduce fluctuating tutoring sequences by inferring the latent stochastic structure hidden in the sequences.
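The training objective described above, maximizing a Gaussian likelihood over both a predicted mean and a predicted variance, can be sketched as a negative log-likelihood loss. This is a minimal illustration only: the function name, the use of NumPy, and the per-step summation are assumptions for exposition, not the paper's implementation, which would compute this loss over the outputs of the recurrent network and minimize it by gradient descent.

```python
import numpy as np

def gaussian_nll(x, mu, var):
    """Negative log-likelihood of target states x under a Gaussian
    whose mean mu and variance var are both predicted by the network.
    Minimizing this is equivalent to maximum likelihood estimation.
    All arguments are arrays of per-timestep values (illustrative)."""
    return 0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)

# Illustrative check: when the prediction matches the target exactly
# and the predicted variance is 1, the per-step loss reduces to
# 0.5 * log(2 * pi).
loss = gaussian_nll(np.array([0.0]), np.array([0.0]), np.array([1.0]))
```

Because the variance appears inside the likelihood, gradient descent on this loss adjusts the variance prediction as well as the mean prediction, which is how the model can capture state-dependent noise.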