The effect of corpus size on case frame acquisition for predicate-argument structure analysis

Ryohei Sasano, Daisuke Kawahara, Sadao Kurohashi

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

This paper reports the effect of corpus size on case frame acquisition for predicate-argument structure analysis in Japanese. For this study, we collect a Japanese corpus consisting of up to 100 billion words, and construct case frames from corpora of six different sizes. Then, we apply these case frames to syntactic and case structure analysis, and zero anaphora resolution, in order to investigate the relationship between the corpus size for case frame acquisition and the performance of predicate-argument structure analysis. We obtained better analyses by using case frames constructed from larger corpora; the performance was not saturated even with a corpus size of 100 billion words.

Original language: English
Pages (from-to): 1361-1368
Number of pages: 8
Journal: IEICE Transactions on Information and Systems
Volume: E93-D
Issue number: 6
Publication status: Published - 2010 Jun
Externally published: Yes

Keywords

  • Case frame
  • Corpus size
  • Predicate-argument structure analysis

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
  • Artificial Intelligence
