A bottom-up approach to sentence ordering for multi-document summarization

Danushka Bollegala, Naoaki Okazaki, Mitsuru Ishizuka

Research output: Contribution to journal (Article)

27 Citations (Scopus)

Abstract

Ordering information is a difficult but important task for applications that generate natural language text, such as multi-document summarization, question answering, and concept-to-text generation. In multi-document summarization, information is selected from a set of source documents. However, improper ordering of that information can confuse the reader and degrade the readability of the summary, so ordering it properly is vital. We present a bottom-up approach to arranging sentences extracted for multi-document summarization. To capture the association and order of two textual segments (e.g. sentences), we define four criteria: chronology, topical-closeness, precedence, and succession. These criteria are combined into a single criterion by a supervised learning approach. We repeatedly concatenate two textual segments into one segment based on this criterion, until we obtain a single overall segment with all sentences arranged. We evaluate the sentence orderings produced by the proposed method and by numerous baselines using subjective grading as well as automatic evaluation measures. We introduce average continuity, an automatic evaluation measure of sentence ordering in a summary, and investigate its appropriateness for this task.
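The merging loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `order_sentences`, `toy_strength`, and `PUBLISHED` are hypothetical, the timestamps are made up, and the scoring function uses chronology alone as a stand-in for the learned combination of the four criteria.

```python
from itertools import permutations
from typing import Callable, List, Sequence, Tuple

Segment = Tuple[str, ...]  # an ordered run of one or more sentences


def order_sentences(
    sentences: Sequence[str],
    strength: Callable[[Segment, Segment], float],
) -> List[str]:
    """Greedy bottom-up ordering.

    Start with one segment per sentence, then repeatedly join the ordered
    pair (a, b) with the highest strength(a, b) into the segment a + b.
    """
    segments: List[Segment] = [(s,) for s in sentences]
    while len(segments) > 1:
        # Consider both directions of every pair: (a, b) and (b, a).
        i, j = max(
            permutations(range(len(segments)), 2),
            key=lambda p: strength(segments[p[0]], segments[p[1]]),
        )
        merged = segments[i] + segments[j]
        segments = [s for k, s in enumerate(segments) if k not in (i, j)]
        segments.append(merged)
    return list(segments[0]) if segments else []


# Assumption: made-up publication times, used only for this illustration.
PUBLISHED = {
    "Rescue teams arrived.": 2,
    "A storm hit the coast.": 1,
    "Power was restored.": 3,
}


def toy_strength(a: Segment, b: Segment) -> float:
    """Hypothetical stand-in for the learned criterion (chronology only).

    Scores "a then b" higher when b starts shortly after a ends.
    """
    gap = PUBLISHED[b[0]] - PUBLISHED[a[-1]]
    return 1.0 / gap if gap > 0 else 0.0


ordered = order_sentences(list(PUBLISHED), toy_strength)
print(ordered)
```

Because each step scores both (a, b) and (b, a), the loop decides which segments to join and in which direction at the same time. In the paper the score comes from a trained model over all four criteria, so any function with the same two-segment signature could be substituted for `toy_strength`.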

Original language: English
Pages (from-to): 89-109
Number of pages: 21
Journal: Information Processing and Management
Volume: 46
Issue number: 1
DOI: 10.1016/j.ipm.2009.07.004
Publication status: Published - 2010 Jan
Externally published: Yes

Fingerprint

Supervised learning
grading
evaluation
continuity
Bottom-up
Summarization
language
learning
Evaluation

Keywords

  • Multi-document summarization
  • Natural language processing
  • Sentence ordering

ASJC Scopus subject areas

  • Media Technology
  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
  • Management Science and Operations Research

Cite this

A bottom-up approach to sentence ordering for multi-document summarization. / Bollegala, Danushka; Okazaki, Naoaki; Ishizuka, Mitsuru.

In: Information Processing and Management, Vol. 46, No. 1, 01.2010, p. 89-109.

Bollegala, Danushka ; Okazaki, Naoaki ; Ishizuka, Mitsuru. / A bottom-up approach to sentence ordering for multi-document summarization. In: Information Processing and Management. 2010 ; Vol. 46, No. 1. pp. 89-109.
@article{4fd27673605742b7bfed3873ddbfcedc,
title = "A bottom-up approach to sentence ordering for multi-document summarization",
keywords = "Multi-document summarization, Natural language processing, Sentence ordering",
author = "Danushka Bollegala and Naoaki Okazaki and Mitsuru Ishizuka",
year = "2010",
month = "1",
doi = "10.1016/j.ipm.2009.07.004",
language = "English",
volume = "46",
pages = "89--109",
journal = "Information Processing and Management",
issn = "0306-4573",
publisher = "Elsevier Limited",
number = "1",

}

TY - JOUR

T1 - A bottom-up approach to sentence ordering for multi-document summarization

AU - Bollegala, Danushka

AU - Okazaki, Naoaki

AU - Ishizuka, Mitsuru

PY - 2010/1

Y1 - 2010/1

KW - Multi-document summarization

KW - Natural language processing

KW - Sentence ordering

UR - http://www.scopus.com/inward/record.url?scp=70349843318&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70349843318&partnerID=8YFLogxK

U2 - 10.1016/j.ipm.2009.07.004

DO - 10.1016/j.ipm.2009.07.004

M3 - Article

AN - SCOPUS:70349843318

VL - 46

SP - 89

EP - 109

JO - Information Processing and Management

JF - Information Processing and Management

SN - 0306-4573

IS - 1

ER -