Key word extraction from a document using word co-occurrence statistical information

Yutaka Matsuo, Mitsuru Ishizuka

Research output: Article (peer-reviewed)

30 Citations (Scopus)

Abstract

We present a new keyword extraction algorithm that applies to a single document without using a large corpus. Frequent terms are extracted first; then, for each term, a set of co-occurrences with the frequent terms (i.e., occurrences in the same sentence) is generated. The distribution of co-occurrences indicates the importance of a term in the document as follows: if the probability distribution of co-occurrence between term a and the frequent terms is biased toward a particular subset of the frequent terms, then term a is likely to be a keyword. The degree of bias of the distribution is measured by the χ² measure. We show that our algorithm performs well for indexing technical papers.
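
The steps named in the abstract (pick frequent terms, count sentence-level co-occurrence, score the bias with χ²) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `chi_square_scores`, the parameter `num_frequent`, the regex tokenizer, and the choice of expected probabilities (each frequent term's share of all co-occurrences) are assumptions made for illustration, and the published method may include refinements that the abstract does not describe.

```python
import re
from collections import Counter


def chi_square_scores(text, num_frequent=10):
    """Score each term by how biased its co-occurrence with frequent terms is.

    Illustrative sketch only; names and simplifications are assumptions.
    No stop-word removal or stemming is performed.
    """
    # Split into sentences and reduce each to its set of lowercase words,
    # so a term is counted at most once per sentence (a simplification).
    raw_sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    sentences = [set(re.findall(r"[a-z]+", s.lower())) for s in raw_sentences]

    # Step 1: frequent terms, ranked by the number of sentences they appear in.
    sentence_freq = Counter(term for s in sentences for term in s)
    frequent = [term for term, _ in sentence_freq.most_common(num_frequent)]

    # Step 2: co-occurrence counts between every term and each frequent term.
    cooc = {}
    for s in sentences:
        present = [g for g in frequent if g in s]
        for w in s:
            for g in present:
                if g != w:
                    cooc.setdefault(w, Counter())[g] += 1

    # Expected probability of each frequent term (assumption): its share of
    # all co-occurrence counts observed in the document.
    total_by_g = Counter()
    for counts in cooc.values():
        total_by_g.update(counts)
    grand_total = sum(total_by_g.values())

    # Step 3: Pearson chi-square between observed and expected co-occurrence.
    scores = {}
    for w, counts in cooc.items():
        n_w = sum(counts.values())
        chi2 = 0.0
        for g in frequent:
            if g == w:
                continue
            expected = n_w * total_by_g[g] / grand_total
            if expected > 0:
                chi2 += (counts[g] - expected) ** 2 / expected
        scores[w] = chi2
    return scores
```

Calling `sorted(scores, key=scores.get, reverse=True)` on the returned dictionary gives a candidate keyword ranking. Terms whose co-occurrence is concentrated on a few frequent terms receive higher scores than terms that co-occur with the frequent terms roughly in proportion to the document-wide pattern.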

Original language: English
Pages (from-to): 217-223
Number of pages: 7
Journal: Transactions of the Japanese Society for Artificial Intelligence
Volume: 17
Issue number: 3
DOI
Publication status: Published - 2002
Externally published: Yes

ASJC Scopus subject areas

  • Artificial Intelligence
