Statistical method of building dialect language models for ASR systems

Naoki Hirayama*, Shinsuke Mori, Hiroshi G. Okuno

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

This paper develops a new statistical method of building language models (LMs) of Japanese dialects for automatic speech recognition (ASR). One possible application is to recognize a variety of utterances in our daily lives. The most crucial problem in training language models for dialects is the shortage of linguistic corpora in dialects. Our solution is to transform linguistic corpora into dialects at the level of word pronunciations. We develop phoneme-sequence transducers based on weighted finite-state transducers (WFSTs). Each word in common language (CL) corpora is automatically labelled with dialect word pronunciations. For example, anta (Kansai dialect) is labelled anata (the most common representation of 'you' in Japanese). Phoneme-sequence transducers are trained from parallel corpora of a dialect and CL. We evaluate the word recognition accuracy of our ASR system. Our method outperforms the ASR system with LMs trained from untransformed corpora in written language by 9.9 points.
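The corpus transformation described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: instead of a phoneme-level WFST it uses a whole-word lookup with negative log-probability weights, which only approximates the idea of a weighted transducer trained on parallel data. The function names (train_transducer, transform) and the toy parallel pairs are hypothetical, chosen for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_transducer(parallel_pairs):
    """Estimate weights -log P(dialect_pron | cl_pron) from parallel pairs.

    parallel_pairs: iterable of (cl_pron, dialect_pron) tuples.
    Simplification (assumption): whole words are aligned one-to-one,
    whereas the paper works on phoneme sequences with WFSTs.
    """
    counts = defaultdict(Counter)
    for cl_pron, dialect_pron in parallel_pairs:
        counts[cl_pron][dialect_pron] += 1

    weights = {}
    for cl_pron, outputs in counts.items():
        total = sum(outputs.values())
        weights[cl_pron] = {
            dialect: -math.log(count / total)
            for dialect, count in outputs.items()
        }
    return weights


def transform(cl_words, weights):
    """Replace each CL pronunciation with its lowest-cost dialect form.

    Words never seen in training are passed through unchanged.
    """
    result = []
    for word in cl_words:
        candidates = weights.get(word)
        if candidates:
            result.append(min(candidates, key=candidates.get))
        else:
            result.append(word)
    return result


# Hypothetical toy parallel data: (common-language form, dialect form).
pairs = [
    ("anata", "anta"),
    ("anata", "anta"),
    ("anata", "anata"),
    ("dame", "akan"),
]
weights = train_transducer(pairs)
dialect_sentence = transform(["anata", "wa", "dame"], weights)
```

A dialect LM could then be trained on the transformed corpus in place of the original CL text. A faithful version would compose phoneme-level transducers and keep multiple weighted outputs per word rather than only the single best one.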

Original language: English
Title of host publication: 24th International Conference on Computational Linguistics - Proceedings of COLING 2012: Technical Papers
Pages: 1179-1194
Number of pages: 16
Publication status: Published - 2012
Externally published: Yes
Event: 24th International Conference on Computational Linguistics, COLING 2012 - Mumbai
Duration: 2012 Dec 8 → 2012 Dec 15

Other

Other: 24th International Conference on Computational Linguistics, COLING 2012
City: Mumbai
Period: 12/12/8 → 12/12/15

Keywords

  • Dialect
  • Language model
  • Spoken language
  • Weighted finite-state transducer (WFST)

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Language and Linguistics
  • Linguistics and Language
