Research on the development of the anthropomorphic flutist robot at Waseda University has focused on emulating, from an engineering point of view, the anatomy and physiology of the organs involved in flute playing; on facilitating the symbiosis between human and robot (i.e. active interaction between musician and musical robot); and on proposing novel applications for humanoid robots (i.e. music education). As a result of this research, the Waseda Flutist Robot No. 4 Refined IV has been developed, and a musical-based interaction system has been implemented so that the robot can interact with musicians by processing both aural and visual cues. However, there is a trade-off between the duration of the flute sound produced by the robot and its sound pressure (volume). In fact, the robot is only capable of playing low-pitched sounds for long periods. In addition, a husky sound is detected when it plays high-pitched sounds. From our discussions with professional players, this effect is caused by the inner shape of the oral cavity. As a consequence, the conversion efficiency ratio from the air exhaled by the artificial lungs to the produced sound is too low. To address this, we obtained MR images of the heads of professional players in order to redesign the oral cavity of the flutist robot. A total of five prototypes were tested, and the best one was selected and integrated into the Waseda Flutist Robot No. 4 Refined VI (WF-4RVI). A set of experiments was carried out to verify the improvements in the conversion efficiency ratio as well as in the sound evaluation function score. The experimental results confirmed these improvements over the previous version of the flutist robot.