The Japanese Dictation Toolkit has been designed and developed as a baseline platform for Japanese LVCSR (Large Vocabulary Continuous Speech Recognition). The platform consists of a standard recognition engine, Japanese phone models, and Japanese statistical language models. We set up a variety of Japanese phone HMMs, ranging from a context-independent monophone model to a triphone model with thousands of states. They are trained on the ASJ (Acoustical Society of Japan) databases. A lexicon and word N-gram (2-gram and 3-gram) models are constructed from a corpus of the Mainichi newspaper. The recognition engine, JULIUS, was developed to evaluate both acoustic and language models. Integrating these modules, we have implemented a baseline 5,000-word dictation system and evaluated its various components. The software repository is available to the public.
|Number of pages|7|
|Journal|Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi)|
|Publication status|Published - 1999 Jan 1|
- Large vocabulary continuous speech recognition
ASJC Scopus subject areas
- Acoustics and Ultrasonics