Analysis and improvement of HITS algorithm for detecting Web communities

S. Nomura, S. Oyama, T. Hayamizu, T. Ishida

Research output: Conference contribution

10 Citations (Scopus)

Abstract

We discuss problems with the HITS (Hyperlink-Induced Topic Search) algorithm, which capitalizes on hyperlinks to extract topic-bound communities of Web pages. Despite its theoretically sound foundations, we observed that the HITS algorithm has failed in real applications. In order to understand this problem, we developed a visualization tool LinkViewer, which graphically presents the extraction process. This tool helped reveal that a large and densely linked set of unrelated Web pages in the base set impeded the extraction. These pages were obtained when the root set was expanded into the base set. As a remedy to this topic drift problem, prior studies applied a textual analysis method. We propose two methods which only utilize the structural information of the Web: 1) the projection method, which projects eigenvectors on the root subspace, so that most elements in the root set will be relevant to the original topic; and 2) the base-set downsizing method, which filters out the pages without links to multiple pages in the root set. These methods are shown to be robust for broader types of topic and low in computation cost.
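The base-set downsizing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no thresholds or data structures, so the function names `hits` and `downsize_base_set`, the parameter `min_root_links=2` (one reading of "multiple"), and the choice to count links in either direction are all assumptions made for illustration. The projection method is not sketched because the abstract does not describe it in enough detail.

```python
from math import sqrt


def hits(pages, links, iterations=50):
    """Plain HITS power iteration; returns (authority, hub) score dicts."""
    pages = set(pages)
    # Ignore links that leave the page set.
    links = [(s, d) for s, d in links if s in pages and d in pages]
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority of a page: sum of hub scores of the pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for src, dst in links:
            new_auth[dst] += hub[src]
        # Hub of a page: sum of authority scores of the pages it links to.
        new_hub = {p: 0.0 for p in pages}
        for src, dst in links:
            new_hub[src] += new_auth[dst]
        # Normalise to unit length so the scores stay bounded.
        a_norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in new_auth.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}
    return auth, hub


def downsize_base_set(root, base, links, min_root_links=2):
    """Keep root pages, plus base pages linked with several root pages.

    Assumption: a link in either direction counts, and "multiple" is
    read as at least `min_root_links` distinct root pages.
    """
    root = set(root)
    base = set(base) | root
    root_neighbours = {p: set() for p in base}
    for src, dst in links:
        if src in base and dst in root and src != dst:
            root_neighbours[src].add(dst)
        if dst in base and src in root and src != dst:
            root_neighbours[dst].add(src)
    kept = {
        p for p in base
        if p in root or len(root_neighbours[p]) >= min_root_links
    }
    kept_links = [(s, d) for s, d in links if s in kept and d in kept]
    return kept, kept_links


# Toy data (hypothetical): x1..x3 form a dense off-topic cluster that
# touches the root set through a single link.
root_set = {"r1", "r2", "r3"}
base_set = root_set | {"a", "x1", "x2", "x3"}
all_links = [
    ("a", "r1"), ("a", "r2"), ("r3", "r1"), ("x1", "r1"),
    ("x1", "x2"), ("x1", "x3"), ("x2", "x1"),
    ("x2", "x3"), ("x3", "x1"), ("x3", "x2"),
]

kept_pages, kept_links = downsize_base_set(root_set, base_set, all_links)
authority, hub_score = hits(kept_pages, kept_links)
```

In this toy graph the filter drops `x1`, `x2`, and `x3`, since none of them is linked with two or more root pages, and HITS is then run on the remaining pages only.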

Original language: English
Title of host publication: Proceedings - 2002 Symposium on Applications and the Internet, SAINT 2002
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 132-140
Number of pages: 9
ISBN (electronic): 0769514472, 9780769514475
Publication status: Published - 2002-01-01
Externally published: Yes
Event: Symposium on Applications and the Internet, SAINT 2002 - Nara City, Japan
Duration: 2002-01-28 to 2002-02-01

Publication series

Name: Proceedings - 2002 Symposium on Applications and the Internet, SAINT 2002

Other

Other: Symposium on Applications and the Internet, SAINT 2002
Country: Japan
City: Nara City
Period: 2002-01-28 to 2002-02-01

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
