Relation extraction from Wikipedia using subtree mining

Dat P T Nguyen*, Yutaka Matsuo, Mitsuru Ishizuka

*Corresponding author for this work

Research output: Conference contribution

77 Citations (Scopus)

Abstract

The exponential growth and reliability of Wikipedia have made it a promising data source for intelligent systems. The first challenge of Wikipedia is to make the encyclopedia machine-processable. In this study, we address the problem of extracting relations among entities from Wikipedia's English articles, which in turn can serve for intelligent systems to satisfy users' information needs. Our proposed method first anchors the appearance of entities in Wikipedia articles using some heuristic rules that are supported by their encyclopedic style. Therefore, it uses neither the Named Entity Recognizer (NER) nor the Coreference Resolution tool, which are sources of errors for relation extraction. It then classifies the relationships among entity pairs using SVM with features extracted from the web structure and subtrees mined from the syntactic structure of text. The innovations behind our work are the following: a) our method makes use of Wikipedia characteristics for entity allocation and entity classification, which are essential for relation extraction; b) our algorithm extracts a core tree, which accurately reflects a relationship between a given entity pair, and subsequently identifies key features with respect to the relationship from the core tree. We demonstrate the effectiveness of our approach through evaluation of manually annotated data from actual Wikipedia articles.
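As a rough illustration of the subtree idea in the abstract, the sketch below takes a toy dependency parse and extracts the tree path that connects two entity tokens, then turns the words on that path into a sparse feature dictionary of the kind an SVM could consume. The abstract does not define how the core tree is built, so the connecting path is only a simplified stand-in for it. The sentence, the head map, and all function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional

# Hypothetical toy dependency parse (an assumption, not data from the paper).
# Each token maps to its syntactic head; None marks the root.
# Sentence: "Gates founded Microsoft in 1975". Tokens are assumed unique.
HEADS: Dict[str, Optional[str]] = {
    "founded": None,
    "Gates": "founded",
    "Microsoft": "founded",
    "in": "founded",
    "1975": "in",
}


def path_to_root(token: str, heads: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of tokens from `token` up to the root, inclusive."""
    chain = [token]
    while heads[chain[-1]] is not None:
        chain.append(heads[chain[-1]])
    return chain


def connecting_path(a: str, b: str, heads: Dict[str, Optional[str]]) -> List[str]:
    """Tokens on the tree path from a to b via their lowest common ancestor."""
    up_a = path_to_root(a, heads)
    up_b = path_to_root(b, heads)
    ancestors_a = set(up_a)
    # The first ancestor of b that is also an ancestor of a is the LCA.
    lca = next(t for t in up_b if t in ancestors_a)
    left = up_a[: up_a.index(lca) + 1]   # a ... LCA
    right = up_b[: up_b.index(lca)]      # b ... child of LCA
    return left + right[::-1]


def path_features(a: str, b: str, heads: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Sparse features from the words strictly between the two entities."""
    path = connecting_path(a, b, heads)
    feats = {f"path_word={w}": 1 for w in path[1:-1]}
    feats["path_length"] = len(path)
    return feats


if __name__ == "__main__":
    print(connecting_path("Gates", "Microsoft", HEADS))
    print(path_features("Gates", "1975", HEADS))
```

A real system would obtain the parse from a syntactic parser and would mine frequent subtrees across many annotated sentences rather than use a single path. This sketch only shows how a tree fragment linking an entity pair can become classifier features.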

Original language: English
Title of host publication: Proceedings of the National Conference on Artificial Intelligence
Pages: 1414-1420
Number of pages: 7
2
Publication status: Published - 2007
Externally published: Yes
Event: AAAI-07/IAAI-07 Proceedings: 22nd AAAI Conference on Artificial Intelligence and the 19th Innovative Applications of Artificial Intelligence Conference - Vancouver, BC
Duration: 22 Jul 2007 to 26 Jul 2007

Other

Other: AAAI-07/IAAI-07 Proceedings: 22nd AAAI Conference on Artificial Intelligence and the 19th Innovative Applications of Artificial Intelligence Conference
City: Vancouver, BC
Period: 22 Jul 2007 to 26 Jul 2007

ASJC Scopus subject areas

  • Software
