We have developed a conversation robot that recognizes the user's head gestures and uses the recognition results as para-linguistic information. In conversation, humans exchange linguistic information, which can be obtained by transcribing the utterance, and para-linguistic information, which supports the transmission of linguistic information. Para-linguistic information conveys nuances that linguistic information alone cannot, and thereby makes conversation natural and effective. In this paper, we recognize the user's head gestures as para-linguistic information in the visual channel. We use the optical flow over the head region as the feature and model it with hidden Markov models (HMMs) for recognition. In actual conversation, the robot may itself perform a gesture while the user is performing one. In this situation, the image sequence captured by the camera mounted on the eyes of the robot sways because of the camera's movement. To solve this problem, we introduce two techniques. The first concerns feature extraction: the optical flow of the body area is used to compensate for the sway in the images. The second concerns the probability models: mode-dependent models are prepared with the maximum likelihood linear regression (MLLR) model adaptation technique, and the models are switched according to the motion mode of the robot. Experimental results show the effectiveness of these techniques.
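The sway compensation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formula, so the code below assumes one simple possibility: the user's body is roughly still, so the mean optical flow over the body area approximates the camera-induced sway, and subtracting it from the mean flow over the head region leaves the head's own motion. The names `mean_flow` and `compensated_head_flow`, the region masks, and the list-based flow representation are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Vector = Tuple[float, float]
FlowField = List[List[Vector]]  # flow[row][col] = (dx, dy) in pixels per frame
Mask = List[List[bool]]         # True where the pixel belongs to the region


def mean_flow(flow: FlowField, mask: Mask) -> Vector:
    """Average the flow vectors over the pixels selected by the mask."""
    sum_dx = 0.0
    sum_dy = 0.0
    count = 0
    for flow_row, mask_row in zip(flow, mask):
        for (dx, dy), selected in zip(flow_row, mask_row):
            if selected:
                sum_dx += dx
                sum_dy += dy
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    return (sum_dx / count, sum_dy / count)


def compensated_head_flow(flow: FlowField, head_mask: Mask, body_mask: Mask) -> Vector:
    """Subtract the mean body flow from the mean head flow.

    Assumption (not stated in the source): the body flow is treated as an
    estimate of the sway caused by the robot's own camera movement.
    """
    head_dx, head_dy = mean_flow(flow, head_mask)
    body_dx, body_dy = mean_flow(flow, body_mask)
    return (head_dx - body_dx, head_dy - body_dy)


# Toy frame: the camera sways 3 px to the right everywhere, and the head
# (top row) additionally moves 2 px downward, as in a nod.
flow = [
    [(3.0, 2.0), (3.0, 2.0)],  # head region
    [(3.0, 0.0), (3.0, 0.0)],  # body region
]
head_mask = [[True, True], [False, False]]
body_mask = [[False, False], [True, True]]

head_motion = compensated_head_flow(flow, head_mask, body_mask)
```

In this toy frame `head_motion` is `(0.0, 2.0)`: the horizontal sway cancels and only the downward head movement remains. A per-frame sequence of such compensated vectors is the kind of feature that could then be passed to the HMM-based recognizer.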
|Publication status||Published - Dec 1, 2004|
|Event||RO-MAN 2004 - 13th IEEE International Workshop on Robot and Human Interactive Communication - Okayama, Japan|
Duration: Sep 20, 2004 → Sep 22, 2004