A robot quizmaster that can localize, separate, and recognize simultaneous utterances for a fastest-voice-first quiz game

Izaya Nishimuta, Naoki Hirayama, Kazuyoshi Yoshii, Katsutoshi Itoyama, Hiroshi G. Okuno

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    2 Citations (Scopus)

    Abstract

    This paper presents an interactive humanoid robot that can moderate a multi-player, fastest-voice-first quiz game by leveraging state-of-the-art robot audition techniques: sound source localization, sound source separation, and speech recognition. In this game, the player who says 'Yes' first earns the right to answer a question, and players are allowed to barge in while the quizmaster is still reading the question. The robot must identify which player said 'Yes' first, even if multiple players respond at almost exactly the same time, and must judge whether that player's answer is correct. To enable natural human-robot interaction, we believe that the robot should use its own microphones (i.e., ears) embedded in its head, rather than pin microphones attached to individual players. We use a robot audition system called HARK to separate the mixture of audio signals recorded by the ears into multiple source signals (i.e., the almost simultaneous utterances of 'Yes' and the quizmaster's question utterance) and to estimate the direction of each source. To judge the correctness of an answer, we use a speech recognizer called Julius. Experimental results showed that our robot can correctly identify which player spoke first when the players' utterances differed in timing by 60 ms.
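The first-responder decision described in the abstract can be illustrated with a minimal sketch. It assumes that separation and recognition have already produced, for each source, an estimated direction, an onset time, and recognized text. The `SeparatedSource` structure, the `first_responder` function, and the `min_gap_ms` margin are hypothetical names introduced here for illustration; they are not part of HARK or Julius, and this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SeparatedSource:
    """One separated sound source (hypothetical structure, not a HARK type)."""
    direction_deg: float  # estimated direction of arrival, in degrees
    onset_ms: float       # estimated utterance onset time, in milliseconds
    text: str             # recognized text for this source


def first_responder(sources: List[SeparatedSource],
                    keyword: str = "yes",
                    min_gap_ms: float = 60.0) -> Optional[SeparatedSource]:
    """Return the source that said the keyword first, or None if undecidable.

    min_gap_ms is an assumed decision margin; 60 ms mirrors the timing
    difference reported in the abstract but is not the authors' parameter.
    """
    # Keep only sources whose recognized text is the keyword, earliest first.
    candidates = sorted(
        (s for s in sources if s.text.strip().lower() == keyword),
        key=lambda s: s.onset_ms,
    )
    if not candidates:
        return None
    # If the two earliest onsets are closer than the margin, refuse to decide.
    if len(candidates) > 1 and candidates[1].onset_ms - candidates[0].onset_ms < min_gap_ms:
        return None
    return candidates[0]
```

In this sketch the quizmaster's own question utterance is just another separated source whose text does not match the keyword, so it is ignored when picking the winner. The winner's `direction_deg` would then tell the robot which player to address.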

    Original language: English
    Title of host publication: IEEE-RAS International Conference on Humanoid Robots
    Publisher: IEEE Computer Society
    Pages: 967-972
    Number of pages: 6
    Volume: 2015-February
    ISBN (Print): 9781479971749
    DOIs: https://doi.org/10.1109/HUMANOIDS.2014.7041480
    Publication status: Published - 2015 Feb 12
    Event: 2014 14th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2014 - Madrid, Spain
    Duration: 2014 Nov 18 to 2014 Nov 20

    Other

    Other: 2014 14th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2014
    Country: Spain
    City: Madrid
    Period: 14/11/18 to 14/11/20

    Fingerprint

    Robots
    Audition
    Microphones
    Human robot interaction
    Barges
    Speech recognition
    Acoustic waves

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Computer Vision and Pattern Recognition
    • Hardware and Architecture
    • Human-Computer Interaction
    • Electrical and Electronic Engineering

    Cite this

    Nishimuta, I., Hirayama, N., Yoshii, K., Itoyama, K., & Okuno, H. G. (2015). A robot quizmaster that can localize, separate, and recognize simultaneous utterances for a fastest-voice-first quiz game. In IEEE-RAS International Conference on Humanoid Robots (Vol. 2015-February, pp. 967-972). [7041480] IEEE Computer Society. https://doi.org/10.1109/HUMANOIDS.2014.7041480

    @inproceedings{48e6d607da244741a368928ecc497f8f,
    title = "A robot quizmaster that can localize, separate, and recognize simultaneous utterances for a fastest-voice-first quiz game",
    author = "Izaya Nishimuta and Naoki Hirayama and Kazuyoshi Yoshii and Katsutoshi Itoyama and Okuno, {Hiroshi G.}",
    year = "2015",
    month = "2",
    day = "12",
    doi = "10.1109/HUMANOIDS.2014.7041480",
    language = "English",
    isbn = "9781479971749",
    volume = "2015-February",
    pages = "967--972",
    booktitle = "IEEE-RAS International Conference on Humanoid Robots",
    publisher = "IEEE Computer Society",

    }

    TY - GEN

    T1 - A robot quizmaster that can localize, separate, and recognize simultaneous utterances for a fastest-voice-first quiz game

    AU - Nishimuta, Izaya

    AU - Hirayama, Naoki

    AU - Yoshii, Kazuyoshi

    AU - Itoyama, Katsutoshi

    AU - Okuno, Hiroshi G.

    PY - 2015/2/12

    Y1 - 2015/2/12

    UR - http://www.scopus.com/inward/record.url?scp=84945177299&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84945177299&partnerID=8YFLogxK

    U2 - 10.1109/HUMANOIDS.2014.7041480

    DO - 10.1109/HUMANOIDS.2014.7041480

    M3 - Conference contribution

    SN - 9781479971749

    VL - 2015-February

    SP - 967

    EP - 972

    BT - IEEE-RAS International Conference on Humanoid Robots

    PB - IEEE Computer Society

    ER -