A mechanical model of the human vocal system is being developed using mechatronics technology. Although various methods of vocal sound production have been actively studied, a mechanical construction is considered advantageous for realizing natural vocalization through its fluid dynamics. The mechanical vocal system has several motors that manipulate the vocal tract and the vocal cords. By means of auditory feedback, the system learns the relations between motor positions and the produced vocal sounds, and it produces the five Japanese vowels (a, i, u, e, o) by mimicking human speech. In addition, the mechanical model can produce some consonant sounds when a dynamically controlled nasal cavity is attached. This paper introduces an adaptive learning algorithm for mimicking human vocalization and presents a listening experiment that evaluates the generated sounds.