Because the minority languages in China have their special characteristics, it is not suitable to directly adopt the traditional automatic speech recognition (ASR) methods which are used for some major languages, such as Chinese, English, Japanese, etc. In this paper, we take Mongolian (a resource-deficient language) as an example and build the acoustic and language models for applying the ATRASR system. In this paper, we specially focus on the language modeling aspect by considering the special characteristics of the Mongolian. We trained a multi-class N-gram language model based on similar word clustering. By applying the proposed language model, the system could improve the performance by 5.5% compared with the conventional word N-gram.
ASJC Scopus subject areas
- コンピュータ グラフィックスおよびコンピュータ支援設計