Japanese Dictation Toolkit -1997 version

Tatsuya Kawahara, Akinobu Lee, Tetsunori Kobayashi, Kazuya Takeda, Nobuaki Minematsu, Katsunobu Itou, Akinori Ito, Mikio Yamamoto, Atsushi Yamada, Takehito Utsuro, Kiyohiro Shikano

    Research output: Contribution to journal (Article)

    27 Citations (Scopus)

    Abstract

    The Japanese Dictation Toolkit has been designed and developed as a baseline platform for Japanese LVCSR (Large Vocabulary Continuous Speech Recognition). The platform consists of a standard recognition engine, Japanese phone models and Japanese statistical language models. We set up a variety of Japanese phone HMMs from a context-independent monophone to a triphone model of thousands of states. They are trained with ASJ (The Acoustical Society of Japan) databases. A lexicon and word N-gram (2-gram and 3-gram) models are constructed with a corpus of Mainichi newspaper. The recognition engine JULIUS is developed for evaluation of both acoustic and language models. As an integrated system of these modules, we have implemented a baseline 5,000-word dictation system and evaluated various components. The software repository is available to the public.

    Original language: English
    Pages (from-to): 233-239
    Number of pages: 7
    Journal: Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi)
    Volume: 20
    Issue number: 3
    DOI: https://doi.org/10.1250/ast.20.233
    Publication status: Published - 1999

    Fingerprint

    engines
    platforms
    speech recognition
    Japan
    modules
    computer programs
    acoustics
    evaluation

    Keywords

    • Large vocabulary continuous speech recognition
    • Software

    ASJC Scopus subject areas

    • Acoustics and Ultrasonics

    Cite this

    Japanese Dictation Toolkit -1997 version. / Kawahara, Tatsuya; Lee, Akinobu; Kobayashi, Tetsunori; Takeda, Kazuya; Minematsu, Nobuaki; Itou, Katsunobu; Ito, Akinori; Yamamoto, Mikio; Yamada, Atsushi; Utsuro, Takehito; Shikano, Kiyohiro.

    In: Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi), Vol. 20, No. 3, 1999, p. 233-239.

    Kawahara, T, Lee, A, Kobayashi, T, Takeda, K, Minematsu, N, Itou, K, Ito, A, Yamamoto, M, Yamada, A, Utsuro, T & Shikano, K 1999, 'Japanese Dictation Toolkit -1997 version', Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi), vol. 20, no. 3, pp. 233-239. https://doi.org/10.1250/ast.20.233
    Kawahara, Tatsuya ; Lee, Akinobu ; Kobayashi, Tetsunori ; Takeda, Kazuya ; Minematsu, Nobuaki ; Itou, Katsunobu ; Ito, Akinori ; Yamamoto, Mikio ; Yamada, Atsushi ; Utsuro, Takehito ; Shikano, Kiyohiro. / Japanese Dictation Toolkit -1997 version. In: Journal of the Acoustical Society of Japan (E) (English translation of Nippon Onkyo Gakkaishi). 1999 ; Vol. 20, No. 3. pp. 233-239.
    @article{44c9f4fbe0b84d2484192bdec27eb80a,
    title = "Japanese Dictation Toolkit -1997 version",
    abstract = "The Japanese Dictation Toolkit has been designed and developed as a baseline platform for Japanese LVCSR (Large Vocabulary Continuous Speech Recognition). The platform consists of a standard recognition engine, Japanese phone models and Japanese statistical language models. We set up a variety of Japanese phone HMMs from a context-independent monophone to a triphone model of thousands of states. They are trained with ASJ (The Acoustical Society of Japan) databases. A lexicon and word N-gram (2-gram and 3-gram) models are constructed with a corpus of Mainichi newspaper. The recognition engine JULIUS is developed for evaluation of both acoustic and language models. As an integrated system of these modules, we have implemented a baseline 5,000-word dictation system and evaluated various components. The software repository is available to the public.",
    keywords = "Large vocabulary continuous speech recognition, Software",
    author = "Tatsuya Kawahara and Akinobu Lee and Tetsunori Kobayashi and Kazuya Takeda and Nobuaki Minematsu and Katsunobu Itou and Akinori Ito and Mikio Yamamoto and Atsushi Yamada and Takehito Utsuro and Kiyohiro Shikano",
    year = "1999",
    doi = "10.1250/ast.20.233",
    language = "English",
    volume = "20",
    pages = "233--239",
    journal = "Acoustical Science and Technology",
    issn = "1346-3969",
    publisher = "Acoustical Society of Japan",
    number = "3",

    }

    TY - JOUR
    T1 - Japanese Dictation Toolkit -1997 version
    AU - Kawahara, Tatsuya
    AU - Lee, Akinobu
    AU - Kobayashi, Tetsunori
    AU - Takeda, Kazuya
    AU - Minematsu, Nobuaki
    AU - Itou, Katsunobu
    AU - Ito, Akinori
    AU - Yamamoto, Mikio
    AU - Yamada, Atsushi
    AU - Utsuro, Takehito
    AU - Shikano, Kiyohiro
    PY - 1999
    Y1 - 1999

    N2 - The Japanese Dictation Toolkit has been designed and developed as a baseline platform for Japanese LVCSR (Large Vocabulary Continuous Speech Recognition). The platform consists of a standard recognition engine, Japanese phone models and Japanese statistical language models. We set up a variety of Japanese phone HMMs from a context-independent monophone to a triphone model of thousands of states. They are trained with ASJ (The Acoustical Society of Japan) databases. A lexicon and word N-gram (2-gram and 3-gram) models are constructed with a corpus of Mainichi newspaper. The recognition engine JULIUS is developed for evaluation of both acoustic and language models. As an integrated system of these modules, we have implemented a baseline 5,000-word dictation system and evaluated various components. The software repository is available to the public.

    KW - Large vocabulary continuous speech recognition
    KW - Software
    UR - http://www.scopus.com/inward/record.url?scp=0000698482&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=0000698482&partnerID=8YFLogxK
    U2 - 10.1250/ast.20.233
    DO - 10.1250/ast.20.233
    M3 - Article
    VL - 20
    SP - 233
    EP - 239
    JO - Acoustical Science and Technology
    JF - Acoustical Science and Technology
    SN - 1346-3969
    IS - 3
    ER -