Temporal multi-page summarization

Adam Jatowt, Mitsuru Ishizuka

Research output: Contribution to journal, Article

11 Citations (Scopus)

Abstract

With the increasing popularity of the Web, efficient approaches to the information overload are becoming more necessary. Summarization of web pages aims at detecting the most important contents from pages so that a user can obtain a compact version of a web document or a group of pages. Traditionally, summaries are constructed on static snapshots of web pages. However, web pages are dynamic objects that can change their contents anytime. In this paper, we discuss the research on temporal multi-document summarization in the Web. We analyze the temporal contents of topically related collections of web pages monitored for certain time intervals. The contents derived from the temporal versions of web documents are summarized to provide information on hot topics and popular events in the collection. We propose two summarization methods that use changing and static contents of web pages downloaded at defined time intervals. The first uses a sliding window mechanism and the second is based on analyzing the time series of the document frequencies of terms. Additionally, we introduce a novel sentence selection algorithm designed for time-dependent scenarios such as temporal summarization.
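The abstract names the second method's input, a time series of document frequencies per term, but gives no formulas. The following is only a minimal sketch of how such a series could be computed from page snapshots, not the authors' algorithm. The function names (`tokenize`, `df_time_series`, `window_gain`), the toy data, and the scoring rule (largest rise in document frequency over a fixed-width window) are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Return the set of lowercase word tokens in a page (simplistic stand-in
    for real preprocessing such as stemming or stop-word removal)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def df_time_series(snapshots):
    """Build {term: [df at t0, df at t1, ...]}.

    `snapshots` holds one list of page texts per time point; the document
    frequency (df) of a term is the number of pages that contain it.
    """
    series = {}
    for t, pages in enumerate(snapshots):
        counts = Counter()
        for page in pages:
            counts.update(tokenize(page))  # each page counts a term once
        for term, df in counts.items():
            series.setdefault(term, [0] * len(snapshots))[t] = df
    return series


def window_gain(values, width):
    """Largest increase in df across any window spanning `width` steps
    (assumed scoring rule; returns 0 if the series is too short)."""
    gains = (values[i + width] - values[i] for i in range(len(values) - width))
    return max(gains, default=0)


# Hypothetical collection: two pages downloaded at three time points.
snapshots = [
    ["election news today", "weather report"],
    ["election results", "election weather report"],
    ["election results announced", "election results update"],
]
series = df_time_series(snapshots)
ranked = sorted(series, key=lambda term: window_gain(series[term], 2), reverse=True)
```

Under this assumed rule, terms whose document frequency rises across the collection within the window rank first, which is one simple way to surface candidate "hot topic" terms before any sentence selection takes place.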

Original language: English
Pages (from-to): 163-180
Number of pages: 18
Journal: Web Intelligence and Agent Systems
Volume: 4
Issue number: 2
Publication status: Published - 2006
Externally published: Yes

Fingerprint

Websites
Time series

Keywords

  • Change detection and relevance
  • Temporal web page analysis
  • Web collection
  • Web document summarization

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computational Theory and Mathematics

Cite this

Jatowt, A., & Ishizuka, M. (2006). Temporal multi-page summarization. Web Intelligence and Agent Systems, 4(2), 163-180.

@article{162328dafb5046ba9723be7ffc0d774b,
title = "Temporal multi-page summarization",
keywords = "Change detection and relevance, Temporal web page analysis, Web collection, Web document summarization",
author = "Adam Jatowt and Mitsuru Ishizuka",
year = "2006",
language = "English",
volume = "4",
pages = "163--180",
journal = "Web Intelligence",
issn = "2405-6456",
publisher = "IOS Press",
number = "2",

}

TY - JOUR
T1 - Temporal multi-page summarization
AU - Jatowt, Adam
AU - Ishizuka, Mitsuru
PY - 2006
Y1 - 2006
KW - Change detection and relevance
KW - Temporal web page analysis
KW - Web collection
KW - Web document summarization
UR - http://www.scopus.com/inward/record.url?scp=33744738909&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33744738909&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:33744738909
VL - 4
SP - 163
EP - 180
JO - Web Intelligence
JF - Web Intelligence
SN - 2405-6456
IS - 2
ER -