Low-cost, bottom-up measures for evaluating search result diversification

Zhicheng Dou*, Xue Yang, Diya Li, Ji Rong Wen, Tetsuya Sakai

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)


Search result diversification aims to cover different user intents by returning a diversified document list. Most existing diversity measures require a predefined set of intents for a given query and assume that these intents are unrelated to one another. However, studies have shown that modeling a hierarchy of intents has some benefits over the standard approach of using a flat list of intents. Intuitively, having more layers in the intent hierarchy seems to imply that we can consider more intricate relationships between intents and thereby identify subtle differences between documents that cover different intents. On the other hand, manually building a rich intent hierarchy imposes extra cost and is probably not very practical. In light of these considerations, we first propose a measure that builds a hierarchy of intents from a given set of flat intents by clustering per-intent relevant documents and thereby identifying subintents. Our second measure is a variant of the first that clusters per-topic relevant documents rather than per-intent ones, and is thus intent-free. Our third measure is a simple, completely intent-free measure for search result diversity evaluation that leverages document similarities. Our experiments based on the TREC Web Track 2009–2013 test collections show that our proposed measures have advantages over existing diversity measures despite their low annotation costs.
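The bottom-up idea behind the first measure (clustering the relevant documents of each flat intent to obtain subintents) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the bag-of-words cosine similarity, the average-link merging rule, the 0.5 stopping threshold, the function names, and the toy documents are not taken from the article, which may use a different representation and clustering procedure.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cluster(doc_texts, threshold=0.5):
    """Average-link agglomerative clustering (hypothetical helper).

    Repeatedly merges the two most similar clusters and stops once the
    best average pairwise similarity falls below `threshold`.
    """
    vecs = {d: Counter(t.lower().split()) for d, t in doc_texts.items()}
    clusters = [[d] for d in sorted(doc_texts)]

    def sim(c1, c2):
        total = sum(cosine(vecs[x], vecs[y]) for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, i, j = max(
            (sim(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def build_hierarchy(relevant_by_intent, doc_texts, threshold=0.5):
    """Map each flat intent to its subintents (clusters of relevant docs)."""
    return {
        intent: cluster({d: doc_texts[d] for d in ids}, threshold)
        for intent, ids in relevant_by_intent.items()
    }


# Toy data, invented for this sketch.
doc_texts = {
    "d1": "jaguar car price dealer",
    "d2": "jaguar car dealer price list",
    "d3": "jaguar car engine repair manual",
    "d4": "jaguar animal habitat rainforest",
    "d5": "jaguar animal rainforest habitat range",
}
relevant_by_intent = {"cars": ["d1", "d2", "d3"], "animal": ["d4", "d5"]}

hierarchy = build_hierarchy(relevant_by_intent, doc_texts)
print(hierarchy)
```

In this toy run the "cars" intent splits into two subintents (buying versus repair) while "animal" stays a single cluster. Passing all documents relevant to the topic as one group, instead of one group per intent, would give the intent-free flavour that the second measure describes.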

Original language: English
Pages (from-to): 86-113
Number of pages: 28
Journal: Information Retrieval Journal
Issue number: 1
Publication status: Published - 2020 Feb 1


Keywords

  • Evaluation measure
  • Hierarchical clustering
  • Search result diversification

ASJC Scopus subject areas

  • Information Systems
  • Library and Information Sciences

