Ranking the NTCIR systems based on multigrade relevance

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


At NTCIR-4, new retrieval effectiveness metrics called Q-measure and R-measure were proposed for evaluation based on multigrade relevance. Through a theoretical analysis, this paper shows that Q-measure inherits both the reliability of noninterpolated Average Precision and the multigrade relevance capability of Average Weighted Precision. It then verifies this claim experimentally by ranking the systems submitted to the NTCIR-3 CLIR Task. Our experiments confirm that the Q-measure ranking is very highly correlated with the Average Precision ranking and that Q-measure is more reliable than Average Weighted Precision.
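The relationship between the three metrics named in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so the definitions below are assumptions taken from the usual published forms: Average Precision averages count(r)/r over relevant ranks, Average Weighted Precision averages cg(r)/cig(r), and Q-measure averages (count(r) + cg(r))/(r + cig(r)), where cg and cig are the cumulative gains of the system output and of an ideal ranking. The function name `evaluate` and the gain values (3, 2, 1 for the highest to lowest relevance grades) are hypothetical choices for illustration.

```python
from typing import Dict, List


def evaluate(gains: List[int], ideal_gains: List[int]) -> Dict[str, float]:
    """Compute AP, AWP and Q-measure for one topic.

    gains:       gain value of the document at each rank of the system
                 output (0 means nonrelevant).
    ideal_gains: gain values of all relevant documents for the topic
                 (assumed definition; order does not matter).
    """
    ideal = sorted(ideal_gains, reverse=True)  # ideal ranked output
    num_rel = len(ideal)                       # R, number of relevant docs
    count = 0   # relevant documents seen so far
    cg = 0      # cumulative gain of the system output
    cig = 0     # cumulative gain of the ideal output
    ap = awp = q = 0.0
    for rank, gain in enumerate(gains, start=1):
        # The ideal list has only R entries, so cig stops growing after rank R.
        cig += ideal[rank - 1] if rank <= num_rel else 0
        cg += gain
        if gain > 0:
            count += 1
            ap += count / rank
            awp += cg / cig
            q += (count + cg) / (rank + cig)
    return {"AP": ap / num_rel, "AWP": awp / num_rel, "Q": q / num_rel}


# One relevant document (gain 3), retrieved at rank 1 versus rank 5.
top = evaluate([3], [3])
late = evaluate([0, 0, 0, 0, 3], [3])

# Three relevant documents retrieved in the reverse of the ideal order.
reversed_order = evaluate([1, 2, 3], [3, 2, 1])
```

Under these assumed definitions, `late` scores AP = 0.2 and Q = 0.5 but AWP = 1.0: because cig stops growing after rank R, AWP does not penalize the late retrieval, while the rank term in the Q-measure denominator does. In `reversed_order`, AP is 1.0 because it ignores grades, whereas Q is about 0.74, reflecting the misordered relevance levels.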

Original language: English
Title of host publication: Lecture Notes in Computer Science
Editors: S.H. Myaeng, M. Zhou, H.J. Zhang, K.-F. Wong
Number of pages: 12
Publication status: Published - 2005
Externally published: Yes
Event: Asia Information Retrieval Symposium, AIRS 2004 - Beijing, China
Duration: 2004 Oct 18 to 2004 Oct 20


Other: Asia Information Retrieval Symposium, AIRS 2004


ASJC Scopus subject areas

  • Computer Science (miscellaneous)

Cite this

Sakai, T. (2005). Ranking the NTCIR systems based on multigrade relevance. In S. H. Myaeng, M. Zhou, H. J. Zhang, & K.-F. Wong (Eds.), Lecture Notes in Computer Science (Vol. 3411, pp. 251-262).