Integrating multiple internet directories by instance-based learning

Ryutaro Ichise, Hideaki Takeda, Shinichi Honiden

Research output: Conference article, peer-reviewed

44 citations (Scopus)

Abstract

Finding desired information on the Internet is becoming increasingly difficult. Internet directories such as Yahoo!, which organize web pages into hierarchical categories, provide one solution to this problem; however, such directories are of limited use because some bias is applied both in the collection and categorization of pages. We propose a method for integrating multiple Internet directories by instance-based learning. Our method provides the mapping of categories in order to transfer documents from one directory to another, instead of simply merging two directories into one. We present herein an effective algorithm for determining similar categories between two directories via a statistical method called the κ-statistic. In order to evaluate the proposed method, we conducted experiments using two actual Internet directories, Yahoo! and Google. The results show that the proposed method achieves extensive improvements relative to both the Naive Bayes and Enhanced Naive Bayes approaches, without any text analysis on documents.
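
As a rough illustration of how instance-based category matching might work, the sketch below scores a pair of categories by their agreement on documents that appear in both directories. It assumes the statistic named in the abstract is Cohen's kappa computed over a 2x2 membership table; the function names `kappa` and `best_match`, the set-based data layout, and the tie-breaking behaviour are hypothetical and are not taken from the paper.

```python
def kappa(cat_a, cat_b, shared):
    """Cohen's kappa for membership agreement between two categories.

    cat_a, cat_b: sets of document ids (e.g. URLs) filed under each category.
    shared: set of document ids that appear in both directories.
    Assumption: only shared documents count as evidence.
    """
    n = len(shared)
    if n == 0:
        raise ValueError("no shared documents to compare")
    a = cat_a & shared
    b = cat_b & shared
    n11 = len(a & b)               # in both categories
    n10 = len(a - b)               # in cat_a only
    n01 = len(b - a)               # in cat_b only
    n00 = n - n11 - n10 - n01      # in neither category
    observed = (n11 + n00) / n
    expected = ((n11 + n10) * (n11 + n01)
                + (n01 + n00) * (n10 + n00)) / (n * n)
    if expected == 1.0:
        return 0.0                 # degenerate table: no information
    return (observed - expected) / (1.0 - expected)


def best_match(source_cat, target_cats, shared):
    """Return (name, score) of the target category with the highest kappa.

    target_cats: dict mapping category name -> set of document ids.
    """
    scores = {name: kappa(source_cat, docs, shared)
              for name, docs in target_cats.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]
```

A call such as `best_match(source_docs, {"Arts": arts_docs, "Science": science_docs}, shared)` would pick the target category whose shared-document membership agrees most with the source category. No document text is inspected, which matches the abstract's point that the approach needs no text analysis.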

Original language: English
Pages (from-to): 22-28
Number of pages: 7
Journal: IJCAI International Joint Conference on Artificial Intelligence
Publication status: Published - 1 Dec 2003
Externally published: Yes
Event: 18th International Joint Conference on Artificial Intelligence, IJCAI 2003 - Acapulco, Mexico
Duration: 9 Aug 2003 to 15 Aug 2003

ASJC Scopus subject areas

  • Artificial Intelligence
