Improving speech understanding accuracy with limited training data using multiple language models and multiple understanding models

Masaki Katsumaru, Mikio Nakano, Kazunori Komatani, Kotaro Funakoshi, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

We aim to improve a speech understanding module when only a small amount of training data is available. A speech understanding module uses a language model (LM) and a language understanding model (LUM). Improving these models requires a large amount of training data, but collecting such data is difficult during actual development. We therefore design and develop a new framework that uses multiple LMs and multiple LUMs to improve speech understanding accuracy under various amounts of training data. Even when little training data is available, each LM and each LUM handles different types of utterances well, so more utterances are understood when multiple LMs and LUMs are used together. As one implementation of the framework, we develop a method that selects the most appropriate speech understanding result from several candidates. The selection is based on probabilities of correctness calculated by logistic regression models. We evaluate the framework with various amounts of training data.
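
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one logistic regression per LM/LUM pair, and the pair names, the single feature (a recognition confidence score), and all weights are hypothetical values chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]  # hypothetical (LM name, LUM name)


@dataclass
class Candidate:
    """One speech understanding result produced by a single LM/LUM pair."""
    pair: Pair
    result: str                 # understanding result, shown here as plain text
    features: Sequence[float]   # hypothetical features, e.g. ASR confidence


def correctness_probability(weights: Sequence[float], bias: float,
                            features: Sequence[float]) -> float:
    """Logistic regression: P(correct) = 1 / (1 + exp(-(w . x + b)))."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def select_result(candidates: List[Candidate],
                  models: Dict[Pair, Tuple[Sequence[float], float]]) -> Candidate:
    """Return the candidate whose regression gives the highest P(correct)."""
    def score(c: Candidate) -> float:
        weights, bias = models[c.pair]
        return correctness_probability(weights, bias, c.features)
    return max(candidates, key=score)


# Hypothetical, hand-set regression parameters (weights, bias) per pair.
models = {
    ("grammar_lm", "fst_lum"): ([2.0], -1.0),
    ("ngram_lm", "keyword_lum"): ([1.0], 0.0),
}

# Two candidate results for the same utterance, one per pair.
candidates = [
    Candidate(("grammar_lm", "fst_lum"), "date=tomorrow", [0.9]),
    Candidate(("ngram_lm", "keyword_lum"), "date=today", [0.5]),
]

best = select_result(candidates, models)
print(best.pair, best.result)
```

In practice the regression parameters would be fitted on labeled utterances rather than set by hand. Using a separate regression for each pair puts the scores of different LM/LUM combinations on a common probability scale, so they can be compared directly.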

Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Pages: 2735-2738
Number of pages: 4
Publication status: Published - 2009
Externally published: Yes
Event: 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009 - Brighton, United Kingdom
Duration: 2009 Sep 6 to 2009 Sep 10

Keywords

  • Limited training data
  • Multiple language models and language understanding models
  • Speech understanding

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Sensory Systems

Cite this

Katsumaru, M., Nakano, M., Komatani, K., Funakoshi, K., Ogata, T., & Okuno, H. G. (2009). Improving speech understanding accuracy with limited training data using multiple language models and multiple understanding models. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 2735-2738)

@inproceedings{c44d8b438446451bb437edf9124050fb,
title = "Improving speech understanding accuracy with limited training data using multiple language models and multiple understanding models",
keywords = "Limited training data, Multiple language models and language understanding models, Speech understanding",
author = "Masaki Katsumaru and Mikio Nakano and Kazunori Komatani and Kotaro Funakoshi and Tetsuya Ogata and Okuno, {Hiroshi G.}",
year = "2009",
language = "English",
pages = "2735--2738",
booktitle = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

}

TY - GEN

T1 - Improving speech understanding accuracy with limited training data using multiple language models and multiple understanding models

AU - Katsumaru, Masaki

AU - Nakano, Mikio

AU - Komatani, Kazunori

AU - Funakoshi, Kotaro

AU - Ogata, Tetsuya

AU - Okuno, Hiroshi G.

PY - 2009

Y1 - 2009

KW - Limited training data

KW - Multiple language models and language understanding models

KW - Speech understanding

UR - http://www.scopus.com/inward/record.url?scp=70450218193&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70450218193&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:70450218193

SP - 2735

EP - 2738

BT - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

ER -