Multiscale recurrent neural network based language model

Tsuyoshi Morioka, Tomoharu Iwata, Takaaki Hori, Tetsunori Kobayashi

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    5 Citations (Scopus)

    Abstract

    We describe a novel recurrent neural network-based language model (RNNLM) dealing with multiple time-scales of contexts. The RNNLM is now a technical standard in language modeling because it remembers some lengths of contexts. However, the RNNLM can only deal with a single time-scale of a context, regardless of the subsequent words and topic of the spoken utterance, even though the optimal time-scale of the context can vary under such conditions. In contrast, our multiscale RNNLM enables incorporating contexts with sufficient flexibility, and it makes use of various time-scales of contexts simultaneously, with proper weights, for predicting the next word. Experimental comparisons carried out in large vocabulary spontaneous speech recognition demonstrate that introducing the multiple time-scales of contexts into the RNNLM yielded improvements over existing RNNLMs in terms of the perplexity and word error rate.
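    The abstract describes combining contexts at several time-scales, each with a proper weight, to predict the next word. The following is a minimal illustrative sketch of that idea, not the authors' exact formulation: several Elman-style hidden states are updated with different leak rates (one per time-scale) and mixed with softmax weights before the output layer. All names, the leaky-update rule, and the fixed mixing logits are assumptions made for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    vocab, hidden = 20, 16
    scales = [1, 4, 16]            # short, medium, and long time-scales (illustrative)

    # One input and recurrent weight matrix per time-scale (hypothetical shapes).
    W_in  = [rng.normal(0, 0.1, (hidden, vocab)) for _ in scales]
    W_rec = [rng.normal(0, 0.1, (hidden, hidden)) for _ in scales]
    W_out = rng.normal(0, 0.1, (vocab, hidden * len(scales)))
    mix   = np.zeros(len(scales))  # mixing logits; learned in a real model, fixed here

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def step(word_id, states):
        """Advance every time-scale state by one token; return a next-word distribution."""
        x = np.zeros(vocab)
        x[word_id] = 1.0
        weights, new_states = softmax(mix), []
        for k, s in enumerate(scales):
            cand = np.tanh(W_in[k] @ x + W_rec[k] @ states[k])
            # Leaky update: a larger scale s gives a slower-changing context.
            new_states.append((1 - 1 / s) * states[k] + (1 / s) * cand)
        # Weight each scale's hidden state before the shared output layer.
        stacked = np.concatenate([w * h for w, h in zip(weights, new_states)])
        return softmax(W_out @ stacked), new_states

    states = [np.zeros(hidden) for _ in scales]
    probs, states = step(3, states)  # feed one (arbitrary) word id
    ```

    The smallest scale reacts to the immediately preceding words while the largest drifts slowly, approximating topic-level context; the softmax mixture lets the model weight these contexts when predicting the next word.
    
    
    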

    Original language: English
    Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
    Publisher: International Speech Communication Association
    Pages: 2366-2370
    Number of pages: 5
    Volume: 2015-January
    Publication status: Published - 2015
    Event: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015 - Dresden, Germany
    Duration: 2015 Sep 6 – 2015 Sep 10

    Other

    Other: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015
    Country: Germany
    City: Dresden
    Period: 15/9/6 – 15/9/10

    Keywords

    • Multiscale dynamics
    • Recurrent neural network based language model (RNNLM)
    • Speech recognition

    ASJC Scopus subject areas

    • Language and Linguistics
    • Human-Computer Interaction
    • Signal Processing
    • Software
    • Modelling and Simulation

    Cite this

    Morioka, T., Iwata, T., Hori, T., & Kobayashi, T. (2015). Multiscale recurrent neural network based language model. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (Vol. 2015-January, pp. 2366-2370). International Speech Communication Association.