Temporal multi-page summarization

Adam Jatowt, Mitsuru Ishizuka

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

With the increasing popularity of the Web, efficient approaches to information overload are becoming more necessary. Web page summarization aims to detect the most important content in pages so that a user can obtain a compact version of a web document or a group of pages. Traditionally, summaries are constructed from static snapshots of web pages. However, web pages are dynamic objects whose contents can change at any time. In this paper, we discuss research on temporal multi-document summarization on the Web. We analyze the temporal contents of topically related collections of web pages monitored over certain time intervals. The contents derived from the temporal versions of web documents are summarized to provide information on hot topics and popular events in the collection. We propose two summarization methods that use the changing and static contents of web pages downloaded at defined time intervals. The first uses a sliding window mechanism, and the second is based on analyzing the time series of the document frequencies of terms. Additionally, we introduce a novel sentence selection algorithm designed for time-dependent scenarios such as temporal summarization.
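To make the second idea in the abstract more concrete, the following is a minimal sketch of building a time series of document frequencies per term across page snapshots and ranking terms by how quickly their frequency rises. It is not the authors' algorithm: the abstract does not specify how the series are analyzed, so the least-squares slope used for ranking, the whitespace tokenizer, and the names `df_time_series`, `slope`, and `rising_terms` are all assumptions made for illustration.

```python
from collections import Counter


def df_time_series(snapshots):
    """Map each term to its document frequency at every time point.

    `snapshots` is a list ordered by time; each element is a list of
    page texts downloaded at that time point (assumed input layout).
    """
    per_time = []
    vocab = set()
    for pages in snapshots:
        counts = Counter()
        for text in pages:
            # Count each term once per page: document frequency, not term frequency.
            counts.update(set(text.lower().split()))
        per_time.append(counts)
        vocab.update(counts)
    return {term: [counts[term] for counts in per_time] for term in vocab}


def slope(series):
    """Least-squares slope of a series against time indices 0..n-1."""
    n = len(series)
    if n < 2:
        return 0.0
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den


def rising_terms(snapshots, top_k=3):
    """Return the top_k terms with the steepest rise in document frequency.

    Ranking by slope is an illustrative assumption, not the paper's scoring.
    Ties are broken alphabetically so the result is deterministic.
    """
    series = df_time_series(snapshots)
    ranked = sorted(series, key=lambda term: (-slope(series[term]), term))
    return ranked[:top_k]
```

A caller would pass one list of page texts per monitoring interval, for example `rising_terms([pages_day1, pages_day2, pages_day3])`, and could then favor sentences containing the returned terms when assembling a summary. A real system would likely also need stop-word removal and normalization by collection size, which this sketch omits.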

Original language: English
Pages (from-to): 163-180
Number of pages: 18
Journal: Web Intelligence and Agent Systems
Volume: 4
Issue number: 2
Publication status: Published - 2006
Externally published: Yes

Keywords

  • Change detection and relevance
  • Temporal web page analysis
  • Web collection
  • Web document summarization

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computational Theory and Mathematics
