Language model domain adaptation via recurrent neural networks with domain-shared and domain-specific representations

Tsuyoshi Morioka, Naohiro Tawara, Tetsuji Ogawa, Atsunori Ogawa, Tomoharu Iwata, Tetsunori Kobayashi

    Research output: Conference contribution

    2 Citations (Scopus)

    Abstract

    Training recurrent neural network language models (RNNLMs) requires a large amount of data, which is difficult to collect for specific domains such as multiparty conversations. Data augmentation using external resources and model adaptation, which adjusts a model trained on a large amount of data to a target domain, have been proposed for low-resource language modeling. Although the source and target domains share some statistics of words and their contexts and differ in others, these adaptation methods do not separate the commonalities from the discrepancies. We propose novel domain adaptation techniques for RNNLMs that introduce domain-shared and domain-specific word embeddings and contextual features. Modeling the commonalities and discrepancies explicitly is expected to improve language modeling performance. Experimental comparisons, using multiparty conversation data as the target domain augmented with lecture data as the source domain, demonstrate that the proposed domain adaptation method improves perplexity and word error rate over a long short-term memory based language model (LSTMLM) trained on both the source and target domain data.
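
The abstract does not say how the domain-shared and domain-specific representations are combined. As a minimal sketch, assuming they are simply concatenated, the word-embedding part might look like the following. The class name `SharedSpecificEmbedding`, the dimension `EMB_DIM`, and the domain labels are hypothetical and are not taken from the paper; the contextual (recurrent) features could be split in the same way.

```python
import random

EMB_DIM = 4  # illustrative size only, not a value from the paper


def make_table(vocab, dim, rng):
    """Create a randomly initialised embedding table: word -> vector."""
    return {w: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for w in vocab}


class SharedSpecificEmbedding:
    """Hypothetical layer: one shared table plus one table per domain."""

    def __init__(self, vocab, domains, dim=EMB_DIM, seed=0):
        rng = random.Random(seed)
        self.shared = make_table(vocab, dim, rng)
        self.specific = {d: make_table(vocab, dim, rng) for d in domains}

    def lookup(self, word, domain):
        # Assumption: the two parts are concatenated. The shared half is
        # identical across domains; the specific half depends on the domain.
        return self.shared[word] + self.specific[domain][word]


vocab = ["hello", "lecture", "meeting"]
emb = SharedSpecificEmbedding(vocab, domains=["source", "target"])
src_vec = emb.lookup("hello", "source")  # e.g. lecture data
tgt_vec = emb.lookup("hello", "target")  # e.g. multiparty conversation data
```

In this sketch the first `EMB_DIM` components of `src_vec` and `tgt_vec` are the same, while the remaining components differ by domain, which is one way to keep commonalities and discrepancies in separate parameters.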

    Original language: English
    Title of host publication: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 6084-6088
    Number of pages: 5
    Volume: 2018-April
    ISBN (Print): 9781538646588
    DOI: https://doi.org/10.1109/ICASSP.2018.8462631
    Publication status: Published - 10 Sep 2018
    Event: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada
    Duration: 15 Apr 2018 to 20 Apr 2018

    Other

    Other: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
    Country: Canada
    City: Calgary
    Period: 15 Apr 2018 to 20 Apr 2018

    ASJC Scopus subject areas

    • Software
    • Signal Processing
    • Electrical and Electronic Engineering

    Cite this

    Morioka, T., Tawara, N., Ogawa, T., Ogawa, A., Iwata, T., & Kobayashi, T. (2018). Language model domain adaptation via recurrent neural networks with domain-shared and domain-specific representations. In 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings (Vol. 2018-April, pp. 6084-6088). [8462631] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2018.8462631