Speech recognition of double talk using SAFIA-based audio segregation

    Research output: Conference contribution

    Abstract

    Double-talk recognition under a distant-microphone condition, a serious problem for speech applications in real environments, is realized through the use of a modified SAFIA together with acoustic model adaptation or training. The original SAFIA is a high-performance audio segregation method based on band selection using two directional microphones. We modified SAFIA by adopting array signal processing, which realizes optimal directivity for SAFIA. We also used generalized harmonic analysis (GHA) instead of the FFT for the spectral analysis in SAFIA to remove the effect of windowing, which causes sound-quality degradation in SAFIA. These modifications enable good segregation in terms of human auditory perception, but the quality is still insufficient for recognition. Because SAFIA introduces a characteristic distortion, we used MLLR-based acoustic model adaptation and immunity training to make the recognizer robust to that distortion. These efforts yielded 76.2% word accuracy at a signal-to-noise ratio of 0 dB. This represents a 45% reduction in error relative to using array signal processing alone, and a 30% error reduction relative to using SAFIA-based audio segregation alone.
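
    The band-selection idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that, for one analysis frame, each frequency band is assigned to whichever of the two channels has the larger magnitude and is zeroed in the other. The function name `band_select` and the toy spectra are hypothetical, and the sketch is not the authors' implementation: it omits the array processing and GHA described above.

```python
def band_select(spec_a, spec_b):
    """Binary band selection between two channel spectra for one frame.

    spec_a, spec_b: sequences of complex spectral values, one per frequency
    band, from two directional channels aimed at speakers A and B.
    Returns (out_a, out_b): each band is kept in the channel where its
    magnitude is larger and set to zero in the other channel.
    """
    if len(spec_a) != len(spec_b):
        raise ValueError("spectra must have the same number of bands")
    out_a, out_b = [], []
    for a, b in zip(spec_a, spec_b):
        if abs(a) >= abs(b):
            # Band is dominated by channel A: keep it there, mute it in B.
            out_a.append(a)
            out_b.append(0j)
        else:
            # Band is dominated by channel B: keep it there, mute it in A.
            out_a.append(0j)
            out_b.append(b)
    return out_a, out_b


# Toy frame with four bands (hypothetical values):
# channel A dominates bands 0 and 2, channel B dominates bands 1 and 3.
frame_a = [1.0 + 0j, 0.2 + 0j, 0.9j, 0.1 + 0j]
frame_b = [0.3 + 0j, 0.8 + 0j, 0.1j, 0.7 + 0j]
seg_a, seg_b = band_select(frame_a, frame_b)
```

    Because every band is either passed through or zeroed, the segregated spectra contain abrupt gaps, which gives one intuition for why such processing might distort speech in ways a recognizer has to be adapted to.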

    Original language: English
    Title of host publication: EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology
    Publisher: International Speech Communication Association
    Pages: 1285-1288
    Number of pages: 4
    Publication status: Published - 2003
    Event: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland
    Duration: 1 September 2003 to 4 September 2003

    ASJC Scopus subject areas

    • Computer Science Applications
    • Software
    • Linguistics and Language
    • Communication

  • Cite this

    Sekiya, T., Ogawa, T., & Kobayashi, T. (2003). Speech recognition of double talk using SAFIA-based audio segregation. In EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology (pp. 1285-1288). International Speech Communication Association.