Evaluating Relevance Judgments with Pairwise Discriminative Power

Zhumin Chu, Jiaxin Mao, Fan Zhang, Yiqun Liu*, Tetsuya Sakai, Min Zhang, Shaoping Ma

*Corresponding author for this work

Research output

1 Citation (Scopus)

Abstract

Relevance judgments play an essential role in the evaluation of information retrieval systems. As many different relevance judgment settings have been proposed in recent years, an evaluation metric to compare relevance judgments in different annotation settings has become a necessity. Traditional metrics, such as Krippendorff's α and φ, have mainly focused on inter-assessor consistency to evaluate the quality of relevance judgments. They encounter a "reliable but useless" problem when employed to compare different annotation settings (e.g., binary judgment vs. 4-grade judgment). Meanwhile, other popular existing metrics such as discriminative power (DP) are not designed to compare relevance judgments across different annotation settings; they therefore suffer from limitations, such as requiring result ranking lists from different systems. How to design an evaluation metric that compares relevance judgments under different grade settings therefore needs further investigation. In this work, we propose a novel metric named pairwise discriminative power (PDP) to evaluate the quality of relevance judgment collections. By leveraging a small number of document-level preference tests, PDP estimates the ability of relevance judgments to discriminate between ranking lists of varying quality. With comprehensive experiments on both synthetic and real-world datasets, we show that PDP maintains a high degree of consistency with annotation quality in various grade settings. Compared with existing metrics (e.g., Krippendorff's α, φ, and DP), it provides reliable evaluation results with affordable additional annotation effort.
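The abstract does not give the PDP formula, so the following is only a minimal sketch of the general idea of checking graded labels against document-level preference tests. It is not the authors' metric. The function name `preference_agreement`, the document IDs, the labels, and the preference pairs are all hypothetical and chosen for illustration. The sketch assumes each preference test names one preferred document, and it counts a pair as tied when the labels cannot separate the two documents.

```python
from typing import Dict, List, Tuple


def preference_agreement(
    labels: Dict[str, int],
    preferences: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Compare graded labels against document-level preference tests.

    Each preference is (preferred_doc, other_doc). A pair is concordant when
    the preferred document has the strictly higher label, tied when the two
    labels are equal, and discordant otherwise.
    NOTE: illustrative only; this is not the PDP definition from the paper.
    """
    if not preferences:
        raise ValueError("at least one preference test is required")
    concordant = tied = discordant = 0
    for preferred, other in preferences:
        if labels[preferred] > labels[other]:
            concordant += 1
        elif labels[preferred] == labels[other]:
            tied += 1
        else:
            discordant += 1
    total = len(preferences)
    return {
        "concordant": concordant / total,
        "tied": tied / total,
        "discordant": discordant / total,
    }


# Hypothetical toy data: four documents judged under two grade settings.
four_grade = {"d1": 3, "d2": 2, "d3": 1, "d4": 0}
binary = {"d1": 1, "d2": 1, "d3": 0, "d4": 0}
# Hypothetical preference tests, each written as (preferred, other).
preferences = [("d1", "d2"), ("d2", "d3"), ("d3", "d4"), ("d1", "d4")]

four_grade_scores = preference_agreement(four_grade, preferences)
binary_scores = preference_agreement(binary, preferences)
print(four_grade_scores)
print(binary_scores)
```

In this toy data the binary labels leave half of the preference pairs tied, while the 4-grade labels separate every pair. Two binary assessors could agree perfectly on such labels, so an inter-assessor consistency score alone would not reveal the difference.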

Original language: English
Title of host publication: CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 261-270
Number of pages: 10
ISBN (electronic): 9781450384469
DOI
Publication status: Published - 26 Oct 2021
Event: 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 - Virtual, Online, Australia
Duration: 1 Nov 2021 to 5 Nov 2021

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Conference

Conference: 30th ACM International Conference on Information and Knowledge Management, CIKM 2021
Country/Territory: Australia
City: Virtual, Online
Period: 1 Nov 2021 to 5 Nov 2021

ASJC Scopus subject areas

  • General Business, Management and Accounting
  • General Decision Sciences
