On the reliability and intuitiveness of aggregated search metrics

Ke Zhou, Mounia Lalmas, Tetsuya Sakai, Ronan Cummins, Joemon M. Jose

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    7 Citations (Scopus)

    Abstract

    Aggregating search results from a variety of diverse verticals such as news, images, videos and Wikipedia into a single interface is a popular web search presentation paradigm. Although several aggregated search (AS) metrics have been proposed to evaluate AS result pages, their properties remain poorly understood. In this paper, we compare the properties of existing AS metrics under the assumptions that (1) queries may have multiple preferred verticals; (2) the likelihood of each vertical preference is available; and (3) the topical relevance assessments of results returned from each vertical are available. We compare a wide range of AS metrics on two test collections. Our main criteria of comparison are (1) discriminative power, which represents the reliability of a metric in comparing the performance of systems, and (2) intuitiveness, which represents how well a metric captures the various key aspects to be measured (i.e., various aspects of a user's perception of AS result pages). Our study shows that the AS metrics that capture key AS components (e.g., vertical selection) have several advantages over other metrics. This work sheds new light on the further development and application of AS metrics.

    Original language: English
    Title of host publication: International Conference on Information and Knowledge Management, Proceedings
    Pages: 689-698
    Number of pages: 10
    DOIs: https://doi.org/10.1145/2505515.2505691
    Publication status: Published - 2013
    Event: 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013 - San Francisco, CA
    Duration: 2013 Oct 27 - 2013 Nov 1

    Fingerprint

    Web search
    Paradigm
    Wikipedia
    Query
    Test collections
    News

    Keywords

    • Aggregated search
    • Discriminative power
    • Diversity
    • Evaluation
    • Intuitiveness
    • Metric
    • Reliability

    ASJC Scopus subject areas

    • Business, Management and Accounting (all)
    • Decision Sciences (all)

    Cite this

    Zhou, K., Lalmas, M., Sakai, T., Cummins, R., & Jose, J. M. (2013). On the reliability and intuitiveness of aggregated search metrics. In International Conference on Information and Knowledge Management, Proceedings (pp. 689-698). https://doi.org/10.1145/2505515.2505691

    @inproceedings{8d2ab6ce10934b8083ecd11329d44b4d,
    title = "On the reliability and intuitiveness of aggregated search metrics",
    abstract = "Aggregating search results from a variety of diverse verticals such as news, images, videos and Wikipedia into a single interface is a popular web search presentation paradigm. Although several aggregated search (AS) metrics have been proposed to evaluate AS result pages, their properties remain poorly understood. In this paper, we compare the properties of existing AS metrics under the assumptions that (1) queries may have multiple preferred verticals; (2) the likelihood of each vertical preference is available; and (3) the topical relevance assessments of results returned from each vertical are available. We compare a wide range of AS metrics on two test collections. Our main criteria of comparison are (1) discriminative power, which represents the reliability of a metric in comparing the performance of systems, and (2) intuitiveness, which represents how well a metric captures the various key aspects to be measured (i.e., various aspects of a user's perception of AS result pages). Our study shows that the AS metrics that capture key AS components (e.g., vertical selection) have several advantages over other metrics. This work sheds new light on the further development and application of AS metrics.",
    keywords = "Aggregated search, Discriminative power, Diversity, Evaluation, Intuitiveness, Metric, Reliability",
    author = "Ke Zhou and Mounia Lalmas and Tetsuya Sakai and Ronan Cummins and Jose, {Joemon M.}",
    year = "2013",
    doi = "10.1145/2505515.2505691",
    language = "English",
    isbn = "9781450322638",
    pages = "689--698",
    booktitle = "International Conference on Information and Knowledge Management, Proceedings",

    }

    TY - GEN
    T1 - On the reliability and intuitiveness of aggregated search metrics
    AU - Zhou, Ke
    AU - Lalmas, Mounia
    AU - Sakai, Tetsuya
    AU - Cummins, Ronan
    AU - Jose, Joemon M.
    PY - 2013
    Y1 - 2013
    AB - Aggregating search results from a variety of diverse verticals such as news, images, videos and Wikipedia into a single interface is a popular web search presentation paradigm. Although several aggregated search (AS) metrics have been proposed to evaluate AS result pages, their properties remain poorly understood. In this paper, we compare the properties of existing AS metrics under the assumptions that (1) queries may have multiple preferred verticals; (2) the likelihood of each vertical preference is available; and (3) the topical relevance assessments of results returned from each vertical are available. We compare a wide range of AS metrics on two test collections. Our main criteria of comparison are (1) discriminative power, which represents the reliability of a metric in comparing the performance of systems, and (2) intuitiveness, which represents how well a metric captures the various key aspects to be measured (i.e., various aspects of a user's perception of AS result pages). Our study shows that the AS metrics that capture key AS components (e.g., vertical selection) have several advantages over other metrics. This work sheds new light on the further development and application of AS metrics.
    KW - Aggregated search
    KW - Discriminative power
    KW - Diversity
    KW - Evaluation
    KW - Intuitiveness
    KW - Metric
    KW - Reliability
    UR - http://www.scopus.com/inward/record.url?scp=84889588155&partnerID=8YFLogxK
    UR - http://www.scopus.com/inward/citedby.url?scp=84889588155&partnerID=8YFLogxK
    U2 - 10.1145/2505515.2505691
    DO - 10.1145/2505515.2505691
    M3 - Conference contribution
    SN - 9781450322638
    SP - 689
    EP - 698
    BT - International Conference on Information and Knowledge Management, Proceedings
    ER -