Spontaneous dialogue speech recognition using cross-word context constrained word graphs

Tohru Shimizu, Hirofumi Yamamoto, Hirokazu Masataki, Shoichi Matsunaga, Yoshinori Sagisaka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

40 Citations (Scopus)

Abstract

This paper proposes a large-vocabulary spontaneous dialogue speech recognizer that uses cross-word context constrained word graphs. The method introduces two approximations, 'cross-word context approximation' and 'lenient language score smearing', to reduce the computational cost of word graph generation. Experimental results on a 'travel arrangement corpus' show that the method reduces the number of word hypotheses by 25-40% and CPU time by 30-60% compared with recognition without approximation, and that using class bigram scores as the expected language score for each lexicon tree node decreases the word error rate by 25-30% compared with recognition without approximation.
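The abstract does not spell out how language scores are attached to lexicon tree nodes, so the following is only a minimal sketch of the general idea of language score smearing with class bigram scores, not the authors' implementation. It assumes that each tree node receives the best class bigram log-score of any word reachable below it; the 'lenient' variant named in the abstract is not modeled. The lexicon, word classes, probabilities, and all function names are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    children: dict = field(default_factory=dict)  # phone -> child Node
    words: list = field(default_factory=list)     # words that end at this node
    smeared: float = -math.inf                    # best LM log-score reachable below


def build_tree(lexicon):
    """Build a phone-prefix tree (lexicon tree) from word -> phone list."""
    root = Node()
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.children.setdefault(phone, Node())
        node.words.append(word)
    return root


def class_bigram_logprob(word, prev_class, word_class, class_bigram, word_given_class):
    """log P(class(word) | prev_class) + log P(word | class(word))."""
    cls = word_class[word]
    return math.log(class_bigram[(prev_class, cls)]) + math.log(word_given_class[word])


def smear(node, score_of):
    """Store in each node the maximum score over all words at or below it."""
    best = max((score_of(w) for w in node.words), default=-math.inf)
    for child in node.children.values():
        best = max(best, smear(child, score_of))
    node.smeared = best
    return best


# Hypothetical toy data (assumed for illustration, not from the paper).
lexicon = {
    "hotel": ["h", "o", "t", "e", "l"],
    "hot": ["h", "o", "t"],
    "room": ["r", "u", "m"],
}
word_class = {"hotel": "PLACE", "hot": "ADJ", "room": "PLACE"}
class_bigram = {("VERB", "PLACE"): 0.5, ("VERB", "ADJ"): 0.1}
word_given_class = {"hotel": 0.6, "hot": 1.0, "room": 0.4}

root = build_tree(lexicon)
smear(
    root,
    lambda w: class_bigram_logprob(w, "VERB", word_class, class_bigram, word_given_class),
)
```

In a decoder built along these lines, a partial-word hypothesis at a tree node could add `node.smeared` to its acoustic score, so that unlikely branches are pruned before the word identity is known.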

Original language: English
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: IEEE
Pages: 145-148
Number of pages: 4
Volume: 1
Publication status: Published - 1996
Externally published: Yes
Event: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 6) - Atlanta, GA, USA
Duration: 1996 May 7 → 1996 May 10

Other

Other: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 6)
City: Atlanta, GA, USA
Period: 96/5/7 → 96/5/10

Fingerprint

speech recognition
approximation
travel
costs

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

Shimizu, T., Yamamoto, H., Masataki, H., Matsunaga, S., & Sagisaka, Y. (1996). Spontaneous dialogue speech recognition using cross-word context constrained word graphs. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 1, pp. 145-148). IEEE.

@inproceedings{5566fb95c0ef4721b30b3e47231a384e,
title = "Spontaneous dialogue speech recognition using cross-word context constrained word graphs",
abstract = "This paper proposes a large-vocabulary spontaneous dialogue speech recognizer that uses cross-word context constrained word graphs. The method introduces two approximations, 'cross-word context approximation' and 'lenient language score smearing', to reduce the computational cost of word graph generation. Experimental results on a 'travel arrangement corpus' show that the method reduces the number of word hypotheses by 25-40{\%} and CPU time by 30-60{\%} compared with recognition without approximation, and that using class bigram scores as the expected language score for each lexicon tree node decreases the word error rate by 25-30{\%} compared with recognition without approximation.",
author = "Tohru Shimizu and Hirofumi Yamamoto and Hirokazu Masataki and Shoichi Matsunaga and Yoshinori Sagisaka",
year = "1996",
language = "English",
volume = "1",
pages = "145--148",
booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "IEEE",

}

TY  - GEN
T1  - Spontaneous dialogue speech recognition using cross-word context constrained word graphs
AU  - Shimizu, Tohru
AU  - Yamamoto, Hirofumi
AU  - Masataki, Hirokazu
AU  - Matsunaga, Shoichi
AU  - Sagisaka, Yoshinori
PY  - 1996
Y1  - 1996
AB  - This paper proposes a large-vocabulary spontaneous dialogue speech recognizer that uses cross-word context constrained word graphs. The method introduces two approximations, 'cross-word context approximation' and 'lenient language score smearing', to reduce the computational cost of word graph generation. Experimental results on a 'travel arrangement corpus' show that the method reduces the number of word hypotheses by 25-40% and CPU time by 30-60% compared with recognition without approximation, and that using class bigram scores as the expected language score for each lexicon tree node decreases the word error rate by 25-30% compared with recognition without approximation.
UR  - http://www.scopus.com/inward/record.url?scp=0029765807&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0029765807&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:0029765807
VL  - 1
SP  - 145
EP  - 148
BT  - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
PB  - IEEE
ER  -