Robust language modeling for a small corpus of target tasks using class-combined word statistics and selective use of a general corpus

Yosuke Wada, Norihiko Kobayashi, Tetsunori Kobayashi

    Research output: Article

    2 Citations (Scopus)

    Abstract

    In order to improve the accuracy of language models for speech recognition tasks in which collecting a large text corpus for language model training is difficult, we propose a class-combined bigram and the selective use of general text. In the class-combined bigram, the word bigram and the class bigram are combined using weights expressed as functions of the frequency of the preceding word and the number of distinct words that follow it. An experiment has shown that the accuracy of the proposed class-combined bigram is equivalent to that of a word bigram trained on a text corpus approximately three times larger. In the selective use of general text, the language model is refined by automatically selecting, from a large volume of text collected without specifying the task, sentences that are expected to improve accuracy, and adding them to the small target-task corpus. An experiment has shown that this reduced the recognition error rate by up to 12% compared with a model built without text selection. Lastly, a model that combines the class-combined bigram with text addition yielded further gains: approximately 34% improvement in adjusted perplexity and approximately 31% in recognition error rate compared with the word bigram built from the target-task text alone.
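
    The abstract describes two components: a class-combined bigram whose interpolation weight depends on statistics of the preceding word, and a filter that adds selected general-corpus sentences to the small target-task corpus. The Python sketch below only illustrates that general scheme, not the authors' exact formulation: the Witten-Bell-style weight, the helper names (class_combined_bigram_prob, select_general_sentences, target_bigram_logprob), and the likelihood-based selection criterion are assumptions, since the precise weight function and selection measure are not given in this abstract.

        def class_combined_bigram_prob(prev_word, word,
                                       p_word_bigram,       # p_word_bigram[prev][w]   = P(w | prev)
                                       p_class_bigram,      # p_class_bigram[pc][c]    = P(c | pc)
                                       p_word_given_class,  # p_word_given_class[c][w] = P(w | c)
                                       word_to_class,       # word -> class label
                                       prev_word_freq,      # corpus frequency of prev_word
                                       successor_types,     # distinct word types seen after prev_word
                                       k=1.0):
            # Weight on the word bigram grows with the preceding word's frequency and
            # shrinks with its successor word-type count (Witten-Bell-style; assumed form).
            lam = prev_word_freq / (prev_word_freq + k * successor_types) if prev_word_freq > 0 else 0.0
            p_w = p_word_bigram.get(prev_word, {}).get(word, 0.0)
            prev_c, c = word_to_class[prev_word], word_to_class[word]
            p_c = p_class_bigram.get(prev_c, {}).get(c, 0.0) * p_word_given_class.get(c, {}).get(word, 0.0)
            return lam * p_w + (1.0 - lam) * p_c

        def select_general_sentences(general_sentences, target_bigram_logprob, threshold):
            # Keep general-corpus sentences whose average per-word log-probability under a
            # target-task bigram model exceeds a threshold (i.e. sentences that resemble the
            # target task); the selected sentences are then added to the training text.
            selected = []
            for sentence in general_sentences:
                words = sentence.split()
                if not words:
                    continue
                pairs = zip(["<s>"] + words[:-1], words)
                avg_logprob = sum(target_bigram_logprob(prev, w) for prev, w in pairs) / len(words)
                if avg_logprob > threshold:
                    selected.append(sentence)
            return selected

    In this illustrative setup, the combined model falls back on the class bigram when the preceding word is rare, which is the general idea behind making a small target-task corpus behave like a larger one; the actual weight function and selection criterion are defined in the paper itself.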

    Original language: English
    Pages (from-to): 92-102
    Number of pages: 11
    Journal: Systems and Computers in Japan
    Volume: 34
    Issue number: 12
    DOI: 10.1002/scj.1219
    Publication status: Published - Nov 15, 2003


    ASJC Scopus subject areas

    • Hardware and Architecture
    • Information Systems
    • Theoretical Computer Science
    • Computational Theory and Mathematics

    Cite this

    @article{570142779ef04574bef5c8e4f74222e2,
    title = "Robust language modeling for a small corpus of target tasks using class-combined word statistics and selective use of a general corpus",
    abstract = "In order to improve the accuracy of language models in speech recognition tasks for which collecting a large text corpus for language model training is difficult, we propose a class-combined bigram and selective use of general text. In the class-combined bigram, the word bigram and the class bigram are combined using weights expressed as functions of the preceding word frequency and the succeeding word-type count. An experiment has shown that the accuracy of the proposed class-combined bigram is equivalent to that of the word bigram trained with a text corpus that is approximately three times larger. In the selective use of general text, the language model was corrected by automatically selecting sentences that were expected to produce better accuracy from a large volume of text collected without specifying the task and by adding these sentences to a small corpus of target tasks. An experiment has shown that the recognition error rate was reduced by up to 12{\%} compared to a case in which text was not selected. Lastly, when we created a model that uses both the class-combined bigram and text addition, further improvements were obtained, resulting in improvements of approximately 34{\%} in adjusted perplexity and approximately 31{\%} in the recognition error rate compared to the word bigram created from the target task text only.",
    keywords = "Class N-gram, Language model, Large-vocabulary continuous speech recognition, Task adaptation",
    author = "Yosuke Wada and Norihiko Kobayashi and Tetsunori Kobayashi",
    year = "2003",
    month = "11",
    day = "15",
    doi = "10.1002/scj.1219",
    language = "English",
    volume = "34",
    pages = "92--102",
    journal = "Systems and Computers in Japan",
    issn = "0882-1666",
    publisher = "John Wiley and Sons Inc.",
    number = "12",

    }

    TY - JOUR
    T1 - Robust language modeling for a small corpus of target tasks using class-combined word statistics and selective use of a general corpus
    AU - Wada, Yosuke
    AU - Kobayashi, Norihiko
    AU - Kobayashi, Tetsunori
    PY - 2003/11/15
    Y1 - 2003/11/15
    N2 - In order to improve the accuracy of language models in speech recognition tasks for which collecting a large text corpus for language model training is difficult, we propose a class-combined bigram and selective use of general text. In the class-combined bigram, the word bigram and the class bigram are combined using weights expressed as functions of the preceding word frequency and the succeeding word-type count. An experiment has shown that the accuracy of the proposed class-combined bigram is equivalent to that of the word bigram trained with a text corpus that is approximately three times larger. In the selective use of general text, the language model was corrected by automatically selecting sentences that were expected to produce better accuracy from a large volume of text collected without specifying the task and by adding these sentences to a small corpus of target tasks. An experiment has shown that the recognition error rate was reduced by up to 12% compared to a case in which text was not selected. Lastly, when we created a model that uses both the class-combined bigram and text addition, further improvements were obtained, resulting in improvements of approximately 34% in adjusted perplexity and approximately 31% in the recognition error rate compared to the word bigram created from the target task text only.
    AB - In order to improve the accuracy of language models in speech recognition tasks for which collecting a large text corpus for language model training is difficult, we propose a class-combined bigram and selective use of general text. In the class-combined bigram, the word bigram and the class bigram are combined using weights expressed as functions of the preceding word frequency and the succeeding word-type count. An experiment has shown that the accuracy of the proposed class-combined bigram is equivalent to that of the word bigram trained with a text corpus that is approximately three times larger. In the selective use of general text, the language model was corrected by automatically selecting sentences that were expected to produce better accuracy from a large volume of text collected without specifying the task and by adding these sentences to a small corpus of target tasks. An experiment has shown that the recognition error rate was reduced by up to 12% compared to a case in which text was not selected. Lastly, when we created a model that uses both the class-combined bigram and text addition, further improvements were obtained, resulting in improvements of approximately 34% in adjusted perplexity and approximately 31% in the recognition error rate compared to the word bigram created from the target task text only.
    KW - Class N-gram
    KW - Language model
    KW - Large-vocabulary continuous speech recognition
    KW - Task adaptation
    UR - http://www.scopus.com/inward/record.url?scp=0142091467&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=0142091467&partnerID=8YFLogxK
    U2 - 10.1002/scj.1219
    DO - 10.1002/scj.1219
    M3 - Article
    AN - SCOPUS:0142091467
    VL - 34
    SP - 92
    EP - 102
    JO - Systems and Computers in Japan
    JF - Systems and Computers in Japan
    SN - 0882-1666
    IS - 12
    ER -