A talking and singing robot that adaptively learns vocalization skills through auditory feedback learning is being developed. The fundamental frequency and the spectrum envelope determine the principal characteristics of a sound. The former characterizes the source sound generated by a vibrating object, and the latter is shaped by resonance effects. In vocalization, the vibration of the vocal cords generates a source sound, and the sound wave is then led into the vocal tract, which works as a resonance filter that determines the spectrum envelope. This paper describes the construction of vocal cords and a vocal tract for realizing a talking and singing robot, together with a control algorithm that acquires singing performance by mimicking human vocalization and singing voices. The generated voices were evaluated in listening experiments.
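The source-filter relationship described above can be illustrated with a minimal numerical sketch. It is not the robot's mechanical system or the control algorithm of the paper; it only shows, under simplifying assumptions, how a periodic source fixes the fundamental frequency while a resonance shapes the spectrum envelope. The function names, the impulse-train source, the single two-pole resonator, and all numerical values (8 kHz sampling rate, 100 Hz fundamental, 700 Hz resonance with 100 Hz bandwidth) are hypothetical choices made for illustration.

```python
import math


def impulse_train(f0_hz, sample_rate, n_samples):
    """Source: one unit impulse per fundamental period.

    A crude, assumed stand-in for the pulses produced by vibrating vocal cords.
    """
    period = round(sample_rate / f0_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal, center_hz, bandwidth_hz, sample_rate):
    """Filter: two-pole resonator, y[n] = x[n] + a1*y[n-1] + a2*y[n-2].

    The poles sit at radius r and angle theta, so the filter boosts
    frequencies near center_hz (an assumed single vocal-tract resonance).
    """
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    theta = 2.0 * math.pi * center_hz / sample_rate
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def magnitude_at(signal, freq_hz, sample_rate):
    """Magnitude of the signal's spectrum at one frequency (direct DFT sum)."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(s * math.cos(w * n) for n, s in enumerate(signal))
    im = sum(s * math.sin(w * n) for n, s in enumerate(signal))
    return math.hypot(re, im)


SAMPLE_RATE = 8000      # Hz, assumed
F0 = 100.0              # Hz, assumed fundamental frequency of the source
RESONANCE = 700.0       # Hz, assumed resonance of the filter

source = impulse_train(F0, SAMPLE_RATE, SAMPLE_RATE)  # one second of source
voiced = resonator(source, RESONANCE, 100.0, SAMPLE_RATE)

# The source has equally strong harmonics; the filter reshapes their envelope.
for f in (100.0, 700.0, 2000.0):
    print(f, magnitude_at(source, f, SAMPLE_RATE), magnitude_at(voiced, f, SAMPLE_RATE))
```

In this sketch, changing `F0` moves the harmonic spacing (the pitch) without moving the spectral peak, whereas changing `RESONANCE` moves the peak of the envelope without changing the pitch, which mirrors the separate roles of the vocal cords and the vocal tract.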