Out-of-vocabulary word recognition with a hierarchical doubly Markov language model

Hiroaki Kokubo, Hirofumi Yamamoto, Yoshihiko Ogawa, Yoshinori Sagisaka, Genichiro Kikui

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    We describe a novel language model for task-dependent out-of-vocabulary (OOV) words. OOV words in a new task, such as personal names and place names, can make language model adaptation difficult. To cope with this problem, we propose a hierarchical, two-layer language model consisting of inter-word constraints and intra-word constraints. The stochastic properties of OOV words under the two constraints are represented by multi-class modeling and trained as independent Markov models. The occurrence probability of an OOV word is expressed by the statistics of the two Markov models (hence the name doubly Markov model). The proposed model was tested on a Japanese conversational speech database of appointment making. When the new language model was used to recognize sentences containing OOV words, the word correct rate rose from 78.2% to 86.7%, reported as a 7.5% improvement.
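
The two-layer factorization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the n-gram order, the subword unit, the smoothing method, or the class inventory, so the bigram order, character-level units, add-one smoothing, the `<NAME>` and `<PLACE>` class labels, and all toy training data below are assumptions chosen only for illustration. The sketch scores an OOV word as log P(class | previous word) from an inter-word model plus log P(spelling | class) from a per-class intra-word model.

```python
import math
from collections import defaultdict


class BigramModel:
    """Add-one smoothed bigram (first-order Markov) model over arbitrary tokens.

    Assumption for illustration only: the source abstract does not specify
    the model order or the smoothing scheme.
    """

    def __init__(self, sequences):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.vocab = {"</s>"}
        for seq in sequences:
            tokens = ["<s>"] + list(seq) + ["</s>"]
            self.vocab.update(tokens[1:])
            for prev, cur in zip(tokens, tokens[1:]):
                self.counts[prev][cur] += 1

    def prob(self, prev, cur):
        # Add-one smoothing over the training vocabulary. A token never seen
        # in training still gets a small nonzero score.
        row = self.counts.get(prev, {})
        total = sum(row.values())
        return (row.get(cur, 0) + 1) / (total + len(self.vocab))

    def sequence_logprob(self, seq):
        tokens = ["<s>"] + list(seq) + ["</s>"]
        return sum(math.log(self.prob(p, c)) for p, c in zip(tokens, tokens[1:]))


# Upper layer (inter-word constraints): word sequences in which each OOV word
# has been replaced by a hypothetical class label.
inter_word = BigramModel([
    ["i", "am", "<NAME>"],
    ["this", "is", "<NAME>"],
    ["meet", "at", "<PLACE>"],
])

# Lower layer (intra-word constraints): one independently trained model per
# OOV class, here over characters of romanized toy examples.
intra_word = {
    "<NAME>": BigramModel(["tanaka", "yamada", "nakata"]),
    "<PLACE>": BigramModel(["osaka", "nara", "kyoto"]),
}


def oov_logprob(prev_word, oov_class, spelling):
    """log P(class | prev_word) + log P(spelling | class)."""
    inter = math.log(inter_word.prob(prev_word, oov_class))
    intra = intra_word[oov_class].sequence_logprob(spelling)
    return inter + intra


if __name__ == "__main__":
    for cls in ("<NAME>", "<PLACE>"):
        print(cls, oov_logprob("is", cls, "tanaka"))
```

Because the two layers are trained independently in this sketch, a class's intra-word model could be retrained on a new name list without touching the inter-word model, which is one plausible reading of why the abstract emphasizes independent Markov models.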

    Original language: English
    Title of host publication: 2003 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2003
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 543-547
    Number of pages: 5
    ISBN (Print): 0780379802, 9780780379800
    DOIs: https://doi.org/10.1109/ASRU.2003.1318498
    Publication status: Published - 2003
    Event: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2003 - St. Thomas, United States
    Duration: 2003 Nov 30 to 2003 Dec 4

    ASJC Scopus subject areas

    • Signal Processing
    • Computer Vision and Pattern Recognition
    • Computer Science Applications

    Cite this

    Kokubo, H., Yamamoto, H., Ogawa, Y., Sagisaka, Y., & Kikui, G. (2003). Out-of-vocabulary word recognition with a hierarchical doubly Markov language model. In 2003 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2003 (pp. 543-547). [1318498] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ASRU.2003.1318498

    @inproceedings{ffa469051fd5471089460f67a5190cc8,
    title = "Out-of-vocabulary word recognition with a hierarchical doubly Markov language model",
    author = "Hiroaki Kokubo and Hirofumi Yamamoto and Yoshihiko Ogawa and Yoshinori Sagisaka and Genichiro Kikui",
    year = "2003",
    doi = "10.1109/ASRU.2003.1318498",
    language = "English",
    isbn = "0780379802",
    pages = "543--547",
    booktitle = "2003 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2003",
    publisher = "Institute of Electrical and Electronics Engineers Inc.",
    }

    TY - GEN

    T1 - Out-of-vocabulary word recognition with a hierarchical doubly Markov language model

    AU - Kokubo, Hiroaki

    AU - Yamamoto, Hirofumi

    AU - Ogawa, Yoshihiko

    AU - Sagisaka, Yoshinori

    AU - Kikui, Genichiro

    PY - 2003

    Y1 - 2003

    UR - http://www.scopus.com/inward/record.url?scp=44949234998&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=44949234998&partnerID=8YFLogxK

    U2 - 10.1109/ASRU.2003.1318498

    DO - 10.1109/ASRU.2003.1318498

    M3 - Conference contribution

    AN - SCOPUS:44949234998

    SN - 0780379802

    SN - 9780780379800

    SP - 543

    EP - 547

    BT - 2003 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2003

    PB - Institute of Electrical and Electronics Engineers Inc.

    ER -