Efficient learning for spoken language understanding tasks with word embedding based pre-training

Yi Luan, Shinji Watanabe, Bret Harsham

Research output: Contribution to journal › Conference article › peer-review

15 Citations (Scopus)

Abstract

Spoken language understanding (SLU) tasks such as goal estimation and intention identification from users' commands are essential components in spoken dialog systems. In recent years, neural network approaches have shown great success in various SLU tasks. However, one major difficulty of SLU is that annotating collected data is expensive, which often leaves insufficient labeled data for a task. A neural network trained under such low-resource conditions usually performs poorly because of overfitting. To improve performance, this paper investigates unsupervised training methods on large-scale corpora, based on word embeddings and latent topic models, to pre-train the SLU networks. To capture long-term characteristics over the entire dialog, we propose a novel Recurrent Neural Network (RNN) architecture. The proposed RNN uses two sub-networks to model the different time scales represented by word and turn sequences. The combination of pre-training and the proposed RNN yields an 18% relative error reduction compared to a baseline system.
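
To make the two-timescale idea concrete, here is a minimal PyTorch sketch of a word-level sub-network that encodes each turn and a turn-level sub-network that tracks the dialog across turns, with the embedding layer optionally initialized from pre-trained word embeddings. All class names, layer sizes, and the goal-classification head are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class TwoScaleSLUNet(nn.Module):
    """Illustrative sketch (not the paper's code): a word-level RNN
    encodes each turn, and a turn-level RNN models the turn sequence
    over the whole dialog for goal estimation."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, num_goals,
                 pretrained_embeddings=None):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        if pretrained_embeddings is not None:
            # Pre-training step: initialize with word embeddings learned
            # on a large unlabeled corpus, then fine-tune on SLU data.
            self.embed.weight.data.copy_(pretrained_embeddings)
        self.word_rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.turn_rnn = nn.RNN(hidden_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_goals)

    def forward(self, dialog):
        # dialog: (batch, turns, words) tensor of padded word ids
        b, t, w = dialog.shape
        words = self.embed(dialog.view(b * t, w))      # (b*t, w, embed)
        _, h_word = self.word_rnn(words)               # (1, b*t, hidden)
        turn_vecs = h_word.squeeze(0).view(b, t, -1)   # one vector per turn
        _, h_turn = self.turn_rnn(turn_vecs)           # (1, b, hidden)
        return self.classifier(h_turn.squeeze(0))      # goal logits

# Example: batch of 2 dialogs, 3 turns each, 6 (padded) words per turn.
model = TwoScaleSLUNet(vocab_size=5000, embed_dim=100,
                       hidden_dim=128, num_goals=10)
logits = model(torch.randint(0, 5000, (2, 3, 6)))
```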

Original language: English
Pages (from-to): 1398-1402
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2015-January
Publication status: Published - 2015 Jan 1
Externally published: Yes
Event: 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015 - Dresden, Germany
Duration: 2015 Sept 6 - 2015 Sept 10

Keywords

  • Fine-tuning
  • Goal estimation
  • Recurrent neural networks
  • Semantic embedding
  • Spoken language understanding

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
