Summaries, ranked retrieval and sessions: A unified framework for information access evaluation

Tetsuya Sakai, Zhicheng Dou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

37 Citations (Scopus)

Abstract

We introduce a general information access evaluation framework that can potentially handle summaries, ranked document lists and even multi-query sessions seamlessly. Our framework first builds a trailtext which represents a concatenation of all the texts read by the user during a search session, and then computes an evaluation metric called U-measure over the trailtext. Instead of discounting the value of a retrieved piece of information based on ranks, U-measure discounts it based on its position within the trailtext. U-measure takes the document length into account just like Time-Biased Gain (TBG), and has the diminishing return property. It is therefore more realistic than rank-based metrics. Furthermore, it is arguably more flexible than TBG, as it is free from the linear traversal assumption (i.e., that the user scans the ranked list from top to bottom), and can handle information access tasks other than ad hoc retrieval. This paper demonstrates the validity and versatility of the U-measure framework. Our main conclusions are: (a) For ad hoc retrieval, U-measure is at least as reliable as TBG in terms of rank correlations with traditional metrics and discriminative power; (b) For diversified search, our diversity versions of U-measure are highly correlated with state-of-the-art diversity metrics; (c) For multi-query sessions, U-measure is highly correlated with Session nDCG; and (d) Unlike rank-based metrics such as DCG, U-measure can quantify the differences between linear and nonlinear traversals in sessions. We argue that our new framework is useful for understanding the user's search behaviour and for comparison across different information access styles (e.g. examining a direct answer vs. examining a ranked list of web pages).
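The abstract describes U-measure as a discount applied by character position within the trailtext (the concatenation of all text the user read) rather than by rank. The following minimal sketch illustrates that idea only; the linear decay function, the budget constant L, and all names here are illustrative assumptions, not the paper's exact definitions or parameter values.

```python
def u_measure(read_texts, gains, L=10000.0):
    """Position-based evaluation over a trailtext (illustrative sketch).

    read_texts: text snippets in the order the user read them.
    gains: per-snippet relevance gain (0 for non-relevant text).
    L: assumed character budget after which the discount reaches zero.
    """
    u, pos = 0.0, 0
    for text, g in zip(read_texts, gains):
        pos += len(text)                    # position at the end of this snippet
        decay = max(0.0, 1.0 - pos / L)     # discount by trailtext position, not rank
        u += g * decay
    return u
```

Because the discount depends on how much text precedes a relevant snippet, a long non-relevant document read first lowers the score of everything after it, which is how the sketch captures the document-length sensitivity the abstract attributes to U-measure and TBG.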

Original language: English
Title of host publication: SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval
Pages: 473-482
Number of pages: 10
ISBN (Print): 9781450320344
DOIs: https://doi.org/10.1145/2484028.2484031
Publication status: Published - 2013
Externally published: Yes
Event: 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2013 - Dublin
Duration: 2013 Jul 28 - 2013 Aug 1



Keywords

  • Diversity
  • Evaluation
  • Metrics
  • Sessions
  • Test collections

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Information Systems

Cite this

Sakai, T., & Dou, Z. (2013). Summaries, ranked retrieval and sessions: A unified framework for information access evaluation. In SIGIR 2013 - Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 473-482). https://doi.org/10.1145/2484028.2484031

