A conversation robot using head gesture recognition as para-linguistic information

Shinya Fujie, Yasushi Ejiri, Kei Nakajima, Yosuke Matsusaka, Tetsunori Kobayashi

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    43 Citations (Scopus)

    Abstract

    A conversation robot that recognizes the user's head gestures and uses the recognition results as para-linguistic information is developed. In conversation, humans exchange linguistic information, which can be obtained by transcribing the utterance, and para-linguistic information, which supports the transmission of the linguistic information. Para-linguistic information conveys nuances that linguistic information alone cannot, making conversation natural and effective. In this paper, we recognize the user's head gestures as para-linguistic information in the visual channel. We use the optical flow over the head region as the feature and model it with HMMs for recognition. In actual conversation, the robot may perform a gesture while the user is performing one. In this situation, the image sequence captured by the camera mounted in the robot's eyes contains sway caused by the camera's own movement. To solve this problem, we introduce two techniques. One concerns feature extraction: the optical flow over the body region is used to compensate for the swayed images. The other concerns the probability models: mode-dependent models are prepared with the MLLR model adaptation technique and are switched according to the robot's motion mode. Experimental results show the effectiveness of these techniques.
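
    To illustrate the feature-extraction idea, here is a minimal sketch, assuming the mean optical flow over the robot's own body region approximates the camera's ego-motion and can be subtracted from the head-region flow. The region coordinates, the Farneback flow estimator, and all function names below are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch of the sway-compensation idea from the abstract:
    # the mean optical flow over the robot's own body region approximates
    # camera ego-motion, so subtracting it from the head-region flow leaves
    # (approximately) the user's head motion. ROIs and the flow estimator
    # are illustrative assumptions.
    import cv2
    import numpy as np

    def mean_flow(flow, region):
        """Average the dense flow vectors inside an (x, y, w, h) region."""
        x, y, w, h = region
        return flow[y:y + h, x:x + w].reshape(-1, 2).mean(axis=0)

    def head_flow_compensated(prev_gray, curr_gray, head_roi, body_roi):
        """Return the head-region flow with camera sway removed."""
        # Dense optical flow between consecutive grayscale frames
        # (Farneback is a stand-in; the paper does not name the estimator).
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, curr_gray, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        head = mean_flow(flow, head_roi)
        body = mean_flow(flow, body_roi)  # proxy for camera ego-motion
        return head - body                # compensated feature vector

    # Example with synthetic frames; in practice these would come from
    # the camera in the robot's eyes.
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        prev = rng.integers(0, 255, (240, 320), dtype=np.uint8)
        curr = np.roll(prev, 2, axis=1)  # fake 2-pixel horizontal sway
        feat = head_flow_compensated(prev, curr,
                                     head_roi=(120, 20, 80, 80),
                                     body_roi=(100, 140, 120, 90))
        print("compensated head flow (dx, dy):", feat)

    The compensated vector could then be accumulated over a window of frames and fed to the per-gesture HMMs; the abstract's second technique (MLLR-adapted, mode-dependent models selected by the robot's motion mode) would replace the HMM set used for scoring rather than change this feature computation.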

    Original language: English
    Title of host publication: Proceedings - IEEE International Workshop on Robot and Human Interactive Communication
    Pages: 159-164
    Number of pages: 6
    Publication status: Published - 2004
    Event: RO-MAN 2004 - 13th IEEE International Workshop on Robot and Human Interactive Communication, Okayama
    Duration: 2004 Sep 20 - 2004 Sep 22


    Fingerprint

    Gesture recognition
    Linguistics
    Robots
    Optical flows
    Cameras
    Transcription
    Feature extraction

    ASJC Scopus subject areas

    • Engineering(all)

    Cite this

    Fujie, S., Ejiri, Y., Nakajima, K., Matsusaka, Y., & Kobayashi, T. (2004). A conversation robot using head gesture recognition as para-linguistic information. In Proceedings - IEEE International Workshop on Robot and Human Interactive Communication (pp. 159-164).
