Speech recognition in the blind condition based on multiple directivity patterns using a microphone array

Toshiyuki Sekiya, Tetsunori Kobayashi

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    2 Citations (Scopus)

    Abstract

    A novel hands-free speech recognition method using a microphone array is proposed and applied to multi-talk recognition in the blind condition, that is, with no prior information about the sound sources or the characteristics of the room acoustics. The proposed system is a cascade of a sound localization system, MUSIC, and a sound segregation system, SMDP (Segregation using Multiple Directivity Patterns), proposed in our previous paper. SMDP is characterized by its use of redundant directivity patterns. Usually, it is difficult for this sort of cascade system to achieve high performance, because the sound localization stage cannot be perfect and errors that occur in this first stage seriously damage the segregation stage. Missing a sound source is particularly critical. In the proposed method, by contrast, errors in the localization stage hardly cause problems as long as they are insertions. Excess sound sources are handled by arranging virtual sound sources, and because SMDP uses redundant directivity patterns from the outset, it tolerates insertion errors. The proposed method achieved 70% word accuracy in a double-talk recognition experiment with a 20K-word vocabulary, which is 18 points better than ICA-based blind source separation in the source-number-given condition.
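To make the term "directivity pattern" concrete, the following is a minimal sketch of a generic delay-and-sum pattern for a uniform linear array, with one pattern steered at each direction reported by a localization stage. It is not the SMDP algorithm from the paper: the array geometry (8 microphones, 4 cm spacing), the 1 kHz analysis frequency, and the function names `steering_vector`, `directivity`, and `pattern_bank` are all assumptions made for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_vector(angle_deg, num_mics, spacing_m, freq_hz):
    """Far-field phase terms for a uniform linear array.

    The angle is measured from broadside (0 degrees = straight ahead).
    """
    delay_per_mic = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [
        cmath.exp(-2j * math.pi * freq_hz * m * delay_per_mic)
        for m in range(num_mics)
    ]


def directivity(look_deg, arrival_deg, num_mics=8, spacing_m=0.04, freq_hz=1000.0):
    """Normalized magnitude response of a delay-and-sum beam.

    The beam is steered to look_deg; the plane wave arrives from arrival_deg.
    The result is 1.0 when the two angles coincide and smaller otherwise.
    """
    w = steering_vector(look_deg, num_mics, spacing_m, freq_hz)
    a = steering_vector(arrival_deg, num_mics, spacing_m, freq_hz)
    return abs(sum(wi.conjugate() * ai for wi, ai in zip(w, a))) / num_mics


def pattern_bank(estimated_dirs_deg, arrival_deg, **array_params):
    """Response of one pattern per estimated direction to a single source."""
    return {
        d: directivity(d, arrival_deg, **array_params) for d in estimated_dirs_deg
    }


# Hypothetical localizer output: two true talkers (30 and -40 degrees)
# plus one spurious "inserted" direction (75 degrees).
bank = pattern_bank([30.0, -40.0, 75.0], arrival_deg=30.0)
best_direction = max(bank, key=bank.get)
```

In this toy setting the extra pattern steered at the spurious direction simply responds more weakly to the real talker, so an insertion adds a redundant pattern rather than removing information. A deletion would leave no pattern aimed at that talker at all, which is consistent with the abstract's remark that missing a source is the critical error.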

    Original language: English
    Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Volume: I
    ISBN (Print): 0780388747, 9780780388741
    DOIs: https://doi.org/10.1109/ICASSP.2005.1415128
    Publication status: Published - 2005
    Event: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Philadelphia, PA
    Duration: 2005 Mar 18 to 2005 Mar 23

    Fingerprint

    speech recognition
    directivity
    Microphones
    Acoustic waves
    acoustics
    sound localization
    Sound stages
    insertion
    cascades
    Blind source separation
    causes
    Independent component analysis
    rooms
    damage
    Experiments

    ASJC Scopus subject areas

    • Signal Processing
    • Acoustics and Ultrasonics
    • Electrical and Electronic Engineering

    Cite this

    Sekiya, T., & Kobayashi, T. (2005). Speech recognition in the blind condition based on multiple directivity patterns using a microphone array. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. I). [1415128] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2005.1415128