Ears of the robot: Three simultaneous speech segregation and recognition using robot-mounted microphones

    Research output: Contribution to journal › Article

    3 Citations (Scopus)

    Abstract

    A new type of sound source segregation method using robot-mounted microphones, which does not require strict head-related transfer function (HRTF) estimation, has been proposed and successfully applied to the recognition of three simultaneous speech signals. The proposed segregation method relies on sound intensity differences that arise from the particular arrangement of four directional microphones and from the robot head acting as a sound barrier. The method consists of three layers of signal processing: two-line SAFIA (binary masking based on narrow-band sound intensity comparison), two-line spectral subtraction, and their integration. We performed a 20K-vocabulary continuous speech recognition test with three speakers talking simultaneously, and achieved a word error reduction of more than 70% compared with the case without any segregation processing.
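The two building blocks named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each microphone channel has already been converted to a per-band power spectrum for a single frame, and the function names, the `alpha` over-subtraction factor, and the `floor` parameter are hypothetical. The integration layer is not sketched because the abstract gives no detail about it.

```python
from typing import List, Sequence, Tuple


def binary_mask_two_channel(
    power_a: Sequence[float], power_b: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Binary masking by narrow-band intensity comparison (illustrative).

    For each frequency band, the channel with the larger power keeps its
    value and the other channel is set to zero. Ties go to channel A.
    """
    out_a: List[float] = []
    out_b: List[float] = []
    for pa, pb in zip(power_a, power_b):
        if pa >= pb:
            out_a.append(pa)
            out_b.append(0.0)
        else:
            out_a.append(0.0)
            out_b.append(pb)
    return out_a, out_b


def spectral_subtract(
    target_power: Sequence[float],
    interference_power: Sequence[float],
    alpha: float = 1.0,   # assumed over-subtraction factor
    floor: float = 0.0,   # assumed spectral floor, as a fraction of the target
) -> List[float]:
    """Power spectral subtraction with a floor (illustrative).

    Subtracts the scaled interference estimate from the target band by band
    and clips the result so it never falls below floor * target.
    """
    return [
        max(t - alpha * i, floor * t)
        for t, i in zip(target_power, interference_power)
    ]


# Toy per-band power values for two microphone channels (made-up numbers).
mic_a = [4.0, 1.0, 0.5, 3.0]
mic_b = [1.0, 2.0, 0.5, 0.25]

masked_a, masked_b = binary_mask_two_channel(mic_a, mic_b)
subtracted_a = spectral_subtract(mic_a, mic_b)

print(masked_a)      # [4.0, 0.0, 0.5, 3.0]
print(masked_b)      # [0.0, 2.0, 0.0, 0.0]
print(subtracted_a)  # [3.0, 0.0, 0.0, 2.75]
```

Binary masking discards a band entirely when the other channel dominates it, whereas spectral subtraction keeps a reduced estimate. That difference is one plausible reason to combine the two, though the abstract does not state the authors' rationale.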

    Original language: English
    Pages (from-to): 1465-1468
    Number of pages: 4
    Journal: IEICE Transactions on Information and Systems
    Volume: E90-D
    Issue number: 9
    DOI: 10.1093/ietisy/e90-d.9.1465
    Publication status: Published - 2007 Sep

    Fingerprint

    Acoustic intensity
    Microphones
    Acoustic waves
    Robots
    Speech intelligibility
    Continuous speech recognition
    Speech recognition
    Transfer functions
    Signal processing
    Processing

    Keywords

    • Robot audition
    • SAFIA
    • Sound source segregation
    • Spectral subtraction
    • Speech recognition

    ASJC Scopus subject areas

    • Electrical and Electronic Engineering
    • Software
    • Artificial Intelligence
    • Hardware and Architecture
    • Computer Vision and Pattern Recognition

    Cite this

    @article{e14c06d6a7ed4c30b647bde19af6ca1b,
    title = "Ears of the robot: Three simultaneous speech segregation and recognition using robot-mounted microphones",
    abstract = "A new type of sound source segregation method using robot-mounted microphones, which are free from strict head related transfer function (HRTF) estimation, has been proposed and successfully applied to three simultaneous speech recognition systems. The proposed segregation method is executed with sound intensity differences that are due to the particular arrangement of the four directivity microphones and the existence of a robot head acting as a sound barrier. The proposed method consists of three-layered signal processing: two-line SAFIA (binary masking based on the narrow band sound intensity comparison), two-line spectral subtraction and their integration. We performed 20 K vocabulary continuous speech recognition test in the presence of three speakers' simultaneous talk, and achieved more than 70{\%} word error reduction compared with the case without any segregation processing.",
    keywords = "Robot audition, SAFIA, Sound source segregation, Spectral subtraction, Speech recognition",
    author = "Naoya Mochiki and Tetsuji Ogawa and Tetsunori Kobayashi",
    year = "2007",
    month = "9",
    doi = "10.1093/ietisy/e90-d.9.1465",
    language = "English",
    volume = "E90-D",
    pages = "1465--1468",
    journal = "IEICE Transactions on Information and Systems",
    issn = "0916-8532",
    publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
    number = "9",

    }

    TY - JOUR

    T1 - Ears of the robot

    T2 - Three simultaneous speech segregation and recognition using robot-mounted microphones

    AU - Mochiki, Naoya

    AU - Ogawa, Tetsuji

    AU - Kobayashi, Tetsunori

    PY - 2007/9

    Y1 - 2007/9

    N2 - A new type of sound source segregation method using robot-mounted microphones, which are free from strict head related transfer function (HRTF) estimation, has been proposed and successfully applied to three simultaneous speech recognition systems. The proposed segregation method is executed with sound intensity differences that are due to the particular arrangement of the four directivity microphones and the existence of a robot head acting as a sound barrier. The proposed method consists of three-layered signal processing: two-line SAFIA (binary masking based on the narrow band sound intensity comparison), two-line spectral subtraction and their integration. We performed 20 K vocabulary continuous speech recognition test in the presence of three speakers' simultaneous talk, and achieved more than 70% word error reduction compared with the case without any segregation processing.

    KW - Robot audition

    KW - SAFIA

    KW - Sound source segregation

    KW - Spectral subtraction

    KW - Speech recognition

    UR - http://www.scopus.com/inward/record.url?scp=68249135534&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=68249135534&partnerID=8YFLogxK

    U2 - 10.1093/ietisy/e90-d.9.1465

    DO - 10.1093/ietisy/e90-d.9.1465

    M3 - Article

    AN - SCOPUS:68249135534

    VL - E90-D

    SP - 1465

    EP - 1468

    JO - IEICE Transactions on Information and Systems

    JF - IEICE Transactions on Information and Systems

    SN - 0916-8532

    IS - 9

    ER -