Automatic allocation of training data for speech understanding based on multiple model combinations

Kazunori Komatani, Mikio Nakano, Masaki Katsumaru, Kotaro Funakoshi, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Contribution to journal › Article

Abstract

The optimal way to build speech understanding modules depends on the amount of training data available. When only a small amount is available, allocating it effectively is crucial to preventing overfitting of statistical methods. We have developed a method for allocating a limited amount of training data in accordance with the amount available. When data are scarce, our method exploits the rule-based methods included in our speech understanding framework, which is based on multiple model combinations, i.e., multiple automatic speech recognition (ASR) modules and multiple language understanding (LU) modules; it then allocates training data preferentially to the modules that dominate the overall performance of speech understanding. Experimental evaluation showed that our allocation method consistently outperforms baseline methods that use a single ASR module and a single LU module as the amount of training data increases.
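
The abstract describes an algorithmic idea: maintain several ASR × LU module combinations and feed scarce training data to whichever module currently limits overall performance. The sketch below illustrates that idea only in the abstract; it is not the authors' implementation. The exponential learning curves, the product approximation of combined accuracy, and all module names and numbers are invented for illustration.

"""Illustrative sketch only: NOT the authors' implementation.

Hypothetical assumptions for this toy model:
  * each module's quality follows a saturating learning curve
      accuracy(n) = ceiling - (ceiling - floor) * exp(-n / scale)
  * rule-based modules have a high floor (usable with no data) but a
    low ceiling; statistical modules start low and improve with data
  * combined speech-understanding accuracy is approximated as the
    product of the chosen ASR and LU modules' accuracies
"""
import math
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    floor: float    # accuracy with no training data
    ceiling: float  # asymptotic accuracy with unlimited data
    scale: float    # batches needed to approach the ceiling
    batches: int = 0

    def accuracy(self) -> float:
        gap = self.ceiling - self.floor
        return self.ceiling - gap * math.exp(-self.batches / self.scale)


def best_combination(asr_mods, lu_mods):
    """Pick the ASR/LU pair with the highest combined accuracy."""
    return max(((a, l) for a in asr_mods for l in lu_mods),
               key=lambda pair: pair[0].accuracy() * pair[1].accuracy())


def allocate(asr_mods, lu_mods, total_batches):
    """Greedily give each batch to the module whose retraining most
    improves the best achievable combined accuracy; ties (no immediate
    combined gain) are broken by the module's own marginal improvement,
    so data still flows to the fastest-learning statistical module."""
    modules = asr_mods + lu_mods
    for _ in range(total_batches):
        def gain(m):
            before = m.accuracy()
            m.batches += 1                      # tentatively add one batch
            a, l = best_combination(asr_mods, lu_mods)
            combined = a.accuracy() * l.accuracy()
            own = m.accuracy() - before
            m.batches -= 1                      # undo the tentative batch
            return (combined, own)
        winner = max(modules, key=gain)
        winner.batches += 1
    return best_combination(asr_mods, lu_mods)


if __name__ == "__main__":
    asr = [Module("grammar-ASR", 0.70, 0.78, 1e9),  # rule-based: flat curve
           Module("ngram-ASR",   0.40, 0.90, 8.0)]  # statistical: data-hungry
    lu  = [Module("rule-LU",     0.75, 0.80, 1e9),
           Module("stat-LU",     0.35, 0.92, 6.0)]
    a, l = allocate(asr, lu, total_batches=20)
    print(f"chosen combination: {a.name} + {l.name}")
    for m in asr + lu:
        print(f"  {m.name:12s} batches={m.batches:2d} accuracy={m.accuracy():.3f}")

In this toy setting, a tiny budget leaves the rule-based pair on top, while a larger budget shifts both the data and the final module choice toward the statistical modules, which is the qualitative behavior the abstract describes.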

Original language: English
Pages (from-to): 2298-2307
Number of pages: 10
Journal: IEICE Transactions on Information and Systems
Volume: E95-D
Issue number: 9
DOIs: 10.1587/transinf.E95.D.2298
Publication status: Published - Sep 2012
Externally published: Yes

Keywords

  • Language understanding
  • Limited amount of training data
  • Rapid prototyping
  • Spoken dialogue system

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Software
  • Artificial Intelligence
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition

Cite this

Komatani, K., Nakano, M., Katsumaru, M., Funakoshi, K., Ogata, T., & Okuno, H. G. (2012). Automatic allocation of training data for speech understanding based on multiple model combinations. IEICE Transactions on Information and Systems, E95-D(9), 2298-2307. https://doi.org/10.1587/transinf.E95.D.2298

@article{467a401e52c641289f9b0da351cb8851,
title = "Automatic allocation of training data for speech understanding based on multiple model combinations",
abstract = "The optimal way to build speech understanding modules depends on the amount of training data available. When only a small amount is available, allocating it effectively is crucial to preventing overfitting of statistical methods. We have developed a method for allocating a limited amount of training data in accordance with the amount available. When data are scarce, our method exploits the rule-based methods included in our speech understanding framework, which is based on multiple model combinations, i.e., multiple automatic speech recognition (ASR) modules and multiple language understanding (LU) modules; it then allocates training data preferentially to the modules that dominate the overall performance of speech understanding. Experimental evaluation showed that our allocation method consistently outperforms baseline methods that use a single ASR module and a single LU module as the amount of training data increases.",
keywords = "Language understanding, Limited amount of training data, Rapid prototyping, Spoken dialogue system",
author = "Komatani, Kazunori and Nakano, Mikio and Katsumaru, Masaki and Funakoshi, Kotaro and Ogata, Tetsuya and Okuno, {Hiroshi G.}",
year = "2012",
month = sep,
doi = "10.1587/transinf.E95.D.2298",
language = "English",
volume = "E95-D",
pages = "2298--2307",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "Maruzen Co., Ltd.",
number = "9",

}
