Language modeling using patterns extracted from parse trees for speech recognition

Takatoshi Jitsuhiro, Hirofumi Yamamoto, Setsuo Yamada, Genichiro Kikui, Yoshinori Sagisaka

Research output: Contribution to journal › Article

Abstract

We propose new language models that represent phrasal structures by patterns extracted from parse trees. First, we propose modified word trigram models. They are extracted from sentences analyzed by the preprocessing stage of the parser with knowledge. Because this analysis groups each sentence into sub-trees of a few words, these trigram models can represent relations among a few neighboring words more strongly than conventional word trigram models. Second, word pattern models are applied on top of these modified word trigram models. The word patterns are extracted from parse trees and can represent phrasal structures and much longer word dependencies than trigram models. Experimental results show that the modified trigram models are more effective than traditional trigram models and that the pattern models attain slight improvements over the modified trigram models. Additional experiments show that the pattern models are more effective for long sentences.
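The abstract does not give the details of the modified trigram models, so the following is only a rough sketch of one possible reading: each small sub-tree is treated as a single unit, and ordinary trigram statistics are then collected over those units. The chunked corpus, the `merge_chunks` helper, the underscore joining, and the unsmoothed maximum-likelihood estimate are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def merge_chunks(chunks):
    """Join each chunk (a list of words) into one unit, e.g. ['a', 'b'] -> 'a_b'.

    Hypothetical helper: stands in for the parser preprocessing output.
    """
    return ["_".join(chunk) for chunk in chunks]


def trigram_counts(sentences):
    """Count trigrams and their two-unit histories over padded unit sequences."""
    tri, bi = Counter(), Counter()
    for units in sentences:
        padded = ["<s>", "<s>"] + units + ["</s>"]
        for i in range(2, len(padded)):
            history = (padded[i - 2], padded[i - 1])
            tri[history + (padded[i],)] += 1
            bi[history] += 1
    return tri, bi


def trigram_prob(tri, bi, w1, w2, w3):
    """Unsmoothed maximum-likelihood estimate of P(w3 | w1, w2)."""
    history_count = bi[(w1, w2)]
    return tri[(w1, w2, w3)] / history_count if history_count else 0.0


# Assumed toy corpus: each inner list stands for one small sub-tree.
corpus = [
    [["i"], ["would", "like"], ["a", "room"]],
    [["i"], ["would", "like"], ["a", "ticket"]],
]
sentences = [merge_chunks(s) for s in corpus]
tri, bi = trigram_counts(sentences)
print(trigram_prob(tri, bi, "i", "would_like", "a_room"))
```

In this toy setup a trigram over three units spans five surface words, which gives some intuition for how chunking could tighten the modeling of neighboring words. A practical model would also need smoothing, which is omitted here.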

Original language: English
Pages (from-to): 446-453
Number of pages: 8
Journal: IEICE Transactions on Information and Systems
Volume: E86-D
Issue number: 3
Publication status: Published - 2003 Mar
Externally published: Yes

Keywords

  • Language model
  • N-gram model
  • Parser
  • Pattern model
  • Speech recognition

ASJC Scopus subject areas

  • Information Systems
  • Computer Graphics and Computer-Aided Design
  • Software

Cite this

Language modeling using patterns extracted from parse trees for speech recognition. / Jitsuhiro, Takatoshi; Yamamoto, Hirofumi; Yamada, Setsuo; Kikui, Genichiro; Sagisaka, Yoshinori.

In: IEICE Transactions on Information and Systems, Vol. E86-D, No. 3, 03.2003, p. 446-453.

@article{e1cca278759240a5a70f05bb1224d1a8,
title = "Language modeling using patterns extracted from parse trees for speech recognition",
abstract = "We propose new language models to represent phrasal structures by patterns extracted from parse trees. First, modified word trigram models are proposed. They are extracted from sentences analyzed by the preprocessing of the parser with knowledge. Since sentences are analyzed to create sub-trees of a few words, these trigram models can represent relations among a few neighbor words more strongly than conventional word trigram models. Second, word pattern models are used on these modified word trigram models. The word patterns are extracted from parse trees and can represent phrasal structures and much longer word-dependency than trigram models. Experimental results show that modified trigram models are more effective than traditional trigram models and that pattern models attain slight improvements over modified trigram models. Furthermore, additional experiments show that pattern models are more effective for long sentences.",
keywords = "Language model, N-gram model, Parser, Pattern model, Speech recognition",
author = "Takatoshi Jitsuhiro and Hirofumi Yamamoto and Setsuo Yamada and Genichiro Kikui and Yoshinori Sagisaka",
year = "2003",
month = "3",
language = "English",
volume = "E86-D",
pages = "446--453",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "3"
}

TY - JOUR

T1 - Language modeling using patterns extracted from parse trees for speech recognition

AU - Jitsuhiro, Takatoshi

AU - Yamamoto, Hirofumi

AU - Yamada, Setsuo

AU - Kikui, Genichiro

AU - Sagisaka, Yoshinori

PY - 2003/3

Y1 - 2003/3

AB - We propose new language models to represent phrasal structures by patterns extracted from parse trees. First, modified word trigram models are proposed. They are extracted from sentences analyzed by the preprocessing of the parser with knowledge. Since sentences are analyzed to create sub-trees of a few words, these trigram models can represent relations among a few neighbor words more strongly than conventional word trigram models. Second, word pattern models are used on these modified word trigram models. The word patterns are extracted from parse trees and can represent phrasal structures and much longer word-dependency than trigram models. Experimental results show that modified trigram models are more effective than traditional trigram models and that pattern models attain slight improvements over modified trigram models. Furthermore, additional experiments show that pattern models are more effective for long sentences.

KW - Language model

KW - N-gram model

KW - Parser

KW - Pattern model

KW - Speech recognition

UR - http://www.scopus.com/inward/record.url?scp=0038719973&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0038719973&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:0038719973

VL - E86-D

SP - 446

EP - 453

JO - IEICE Transactions on Information and Systems

JF - IEICE Transactions on Information and Systems

SN - 0916-8532

IS - 3

ER -