Generic summaries for indexing in information retrieval

Tetsuya Sakai, Karen Sparck-Jones

Research output: Contribution to journal › Article

44 Citations (Scopus)

Abstract

This paper examines the use of generic summaries for indexing in information retrieval. Our main observations are that: (1) With or without pseudo-relevance feedback, a summary index may be as effective as the corresponding fulltext index for precision-oriented search of highly relevant documents. But a reasonably sophisticated summarizer, using a compression ratio of 10-30%, is desirable for this purpose. (2) In pseudo-relevance feedback, using a summary index at initial search and a fulltext index at final search is possibly effective for precision-oriented search, regardless of relevance levels. This strategy is significantly more effective than the one using the summary index only and probably more effective than using summaries as mere term selection filters. For this strategy, the summary quality is probably not a critical factor, and a compression ratio of 5-10% appears best.
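The strategy in observation (2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the term-frequency scoring, the toy indexes, and the names `score`, `search`, and `prf_summary_then_fulltext` are all hypothetical. The abstract does not say where expansion terms are drawn from, so the sketch assumes they come from the summaries of the top-ranked documents.

```python
from collections import Counter


def score(query_terms, doc_terms):
    """Toy score (assumption): summed frequency of the query terms in the document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)


def search(query_terms, index):
    """Return ids of matching documents, best score first, ties broken by id."""
    scored = [(score(query_terms, terms), doc_id) for doc_id, terms in index.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for s, doc_id in ranked if s > 0]


def prf_summary_then_fulltext(query_terms, summary_index, fulltext_index,
                              top_docs=2, new_terms=2):
    """Pseudo-relevance feedback: summary index first, fulltext index last."""
    # Initial search runs against the summary index only.
    initial = search(query_terms, summary_index)[:top_docs]

    # Assumption: expansion terms are taken from the summaries of the
    # top-ranked documents, preferring the most frequent unseen terms.
    pool = Counter()
    for doc_id in initial:
        pool.update(t for t in summary_index[doc_id] if t not in query_terms)
    ordered = sorted(pool.items(), key=lambda pair: (-pair[1], pair[0]))
    expansion = [term for term, _ in ordered[:new_terms]]

    # Final search runs the expanded query against the fulltext index.
    final = search(list(query_terms) + expansion, fulltext_index)
    return final, expansion


# Hypothetical toy collection: each summary is a short subset of its document.
fulltext_index = {
    "d1": "summary index retrieval feedback precision summary".split(),
    "d2": "fulltext index retrieval compression ratio".split(),
    "d3": "cooking recipes pasta".split(),
}
summary_index = {
    "d1": "summary index feedback".split(),
    "d2": "index compression".split(),
    "d3": "cooking pasta".split(),
}

ranking, expansion = prf_summary_then_fulltext(["index"], summary_index, fulltext_index)
print(expansion, ranking)
```

Keeping the two indexes as separate arguments makes it easy to pass the same index twice and compare against a summary-only or fulltext-only run.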

Original language: English
Pages (from-to): 190-198
Number of pages: 9
Journal: SIGIR Forum (ACM Special Interest Group on Information Retrieval)
Publication status: Published - 2001
Externally published: Yes

Fingerprint

Information retrieval
Feedback
Indexing
Compression
Pseudo-relevance feedback

ASJC Scopus subject areas

  • Hardware and Architecture
  • Management Information Systems

Cite this

Generic summaries for indexing in information retrieval. / Sakai, Tetsuya; Sparck-Jones, Karen.

In: SIGIR Forum (ACM Special Interest Group on Information Retrieval), 2001, p. 190-198.


@article{dfb51a07a96440a8a06625a91ee7b6c7,
title = "Generic summaries for indexing in information retrieval",
abstract = "This paper examines the use of generic summaries for indexing in information retrieval. Our main observations are that: (1) With or without pseudo-relevance feedback, a summary index may be as effective as the corresponding fulltext index for precision-oriented search of highly relevant documents. But a reasonably sophisticated summarizer, using a compression ratio of 10-30{\%}, is desirable for this purpose. (2) In pseudo-relevance feedback, using a summary index at initial search and a fulltext index at final search is possibly effective for precision-oriented search, regardless of relevance levels. This strategy is significantly more effective than the one using the summary index only and probably more effective than using summaries as mere term selection filters. For this strategy, the summary quality is probably not a critical factor, and a compression ratio of 5-10{\%} appears best.",
author = "Tetsuya Sakai and Karen Sparck-Jones",
year = "2001",
language = "English",
pages = "190--198",
journal = "SIGIR Forum (ACM Special Interest Group on Information Retrieval)",
issn = "0163-5840",
publisher = "Association for Computing Machinery (ACM)"
}

TY - JOUR

T1 - Generic summaries for indexing in information retrieval

AU - Sakai, Tetsuya

AU - Sparck-Jones, Karen

PY - 2001

Y1 - 2001

N2 - This paper examines the use of generic summaries for indexing in information retrieval. Our main observations are that: (1) With or without pseudo-relevance feedback, a summary index may be as effective as the corresponding fulltext index for precision-oriented search of highly relevant documents. But a reasonably sophisticated summarizer, using a compression ratio of 10-30%, is desirable for this purpose. (2) In pseudo-relevance feedback, using a summary index at initial search and a fulltext index at final search is possibly effective for precision-oriented search, regardless of relevance levels. This strategy is significantly more effective than the one using the summary index only and probably more effective than using summaries as mere term selection filters. For this strategy, the summary quality is probably not a critical factor, and a compression ratio of 5-10% appears best.

AB - This paper examines the use of generic summaries for indexing in information retrieval. Our main observations are that: (1) With or without pseudo-relevance feedback, a summary index may be as effective as the corresponding fulltext index for precision-oriented search of highly relevant documents. But a reasonably sophisticated summarizer, using a compression ratio of 10-30%, is desirable for this purpose. (2) In pseudo-relevance feedback, using a summary index at initial search and a fulltext index at final search is possibly effective for precision-oriented search, regardless of relevance levels. This strategy is significantly more effective than the one using the summary index only and probably more effective than using summaries as mere term selection filters. For this strategy, the summary quality is probably not a critical factor, and a compression ratio of 5-10% appears best.

UR - http://www.scopus.com/inward/record.url?scp=0034795978&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0034795978&partnerID=8YFLogxK

M3 - Article

SP - 190

EP - 198

JO - SIGIR Forum (ACM Special Interest Group on Information Retrieval)

JF - SIGIR Forum (ACM Special Interest Group on Information Retrieval)

SN - 0163-5840

ER -