Word class modeling for speech recognition with out-of-task words using a hierarchical language model

Yoshihiko Ogawa, Hirofumi Yamamoto, Yoshinori Sagisaka, Genichiro Kikui

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    5 Citations (Scopus)

    Abstract

    Out-of-vocabulary (OOV) problems arise frequently when a language model is adapted to another task in which the word classes have been observed but few of their individual words have, such as names, places, and other proper nouns. Simple task adaptation cannot handle this problem properly. In this paper, we adopt a hierarchical language model for task-dependent OOV words in the noun category. In this model, the lower class model, which expresses word phonotactics, does not require any additional task-dependent corpora for training. It can be trained independently of the upper class model of conventional word class N-grams, because the proposed hierarchical model clearly separates inter-word characteristics from intra-word characteristics. This independent, layered training makes it possible to apply the model to general vocabularies and tasks in combination with conventional language model adaptation techniques. In speech recognition experiments, word accuracy on sentences containing OOV words rose by 19 points (from 54% to 73%) compared with a conventional adapted model, while accuracy on sentences without OOV words remained comparable (85%). This improvement corresponds to the performance obtained when all OOV words are ideally registered in the dictionary.
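The two-layer idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bigram order, the class token `<NAME>`, the sub-word units, the probability floor, and every probability value below are hypothetical and chosen only for illustration. The upper layer scores in-vocabulary words and class tokens; when a class token stands for an OOV word, the lower layer adds the score of that word's sub-word sequence.

```python
import math

# Hypothetical upper layer: bigrams over in-vocabulary words and class tokens.
# All probabilities are invented for illustration.
UPPER_BIGRAM = {
    ("<s>", "call"): 0.2,
    ("call", "<NAME>"): 0.3,
    ("call", "home"): 0.1,
    ("<NAME>", "</s>"): 0.5,
    ("home", "</s>"): 0.6,
}

# Hypothetical lower layer: per-class bigrams over sub-word units.
# It is a separate table, so it can be estimated without the upper layer.
LOWER_BIGRAM = {
    "<NAME>": {
        ("<w>", "o"): 0.1,
        ("o", "ga"): 0.2,
        ("ga", "wa"): 0.3,
        ("wa", "</w>"): 0.4,
    }
}

FLOOR = 1e-6  # assumed floor for unseen events, standing in for real smoothing


def lower_logprob(word_class, units):
    """Log probability of a sub-word sequence under the class's lower model."""
    table = LOWER_BIGRAM[word_class]
    seq = ["<w>", *units, "</w>"]
    return sum(math.log(table.get(pair, FLOOR)) for pair in zip(seq, seq[1:]))


def sentence_logprob(tokens):
    """Score a sentence; an OOV word is given as a (class, units) tuple."""
    logp = 0.0
    prev = "<s>"
    for tok in list(tokens) + ["</s>"]:
        if isinstance(tok, tuple):
            word_class, units = tok
            # Inter-word part: the class token in its word context.
            logp += math.log(UPPER_BIGRAM.get((prev, word_class), FLOOR))
            # Intra-word part: the sub-word sequence inside the class.
            logp += lower_logprob(word_class, units)
            prev = word_class
        else:
            logp += math.log(UPPER_BIGRAM.get((prev, tok), FLOOR))
            prev = tok
    return logp


oov_sentence = ["call", ("<NAME>", ["o", "ga", "wa"])]
print(sentence_logprob(oov_sentence))
```

Because the two tables are kept apart in this sketch, replacing or retraining `LOWER_BIGRAM` leaves `UPPER_BIGRAM` untouched, which mirrors the independent training of the layers described in the abstract.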

    Original language: English
    Title of host publication: EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology
    Publisher: International Speech Communication Association
    Pages: 221-224
    Number of pages: 4
    Publication status: Published - 2003
    Event: 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland
    Duration: 2003 Sep 1 to 2003 Sep 4

    ASJC Scopus subject areas

    • Computer Science Applications
    • Software
    • Linguistics and Language
    • Communication

  • Cite this

    Ogawa, Y., Yamamoto, H., Sagisaka, Y., & Kikui, G. (2003). Word class modeling for speech recognition with out-of-task words using a hierarchical language model. In EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology (pp. 221-224). International Speech Communication Association.