Statistical significance, power, and sample sizes: A systematic review of SIGIR and TOIS, 2006-2015

    Research output: Conference contribution

    20 Citations (Scopus)

    Abstract

    We conducted a systematic review of 840 SIGIR full papers and 215 TOIS papers published between 2006 and 2015. The original objective of the study was to identify IR effectiveness experiments that are seriously underpowered (i.e., the sample size is far too small so that the probability of missing a real difference is extremely high) or overpowered (i.e., the sample size is so large that a difference will be considered statistically significant even if the actual effect size is extremely small). However, it quickly became clear to us that many IR effectiveness papers either lack significance testing or fail to report p-values and/or test statistics, which prevents us from conducting power analysis. Hence we first report on how IR researchers (fail to) report on significance test results, what types of tests they use, and how the reporting practices may have changed over the last decade. From those papers that reported enough information for us to conduct power analysis, we identify extremely overpowered and underpowered experiments, as well as appropriate sample sizes for future experiments. The raw results of our systematic survey of 1,055 papers and our R scripts for power analysis are available online. Our hope is that this study will help improve the reporting practices and experimental designs of future IR effectiveness studies.

    Original language: English
    Title of host publication: SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval
    Publisher: Association for Computing Machinery, Inc
    Pages: 5-14
    Number of pages: 10
    ISBN (electronic): 9781450342902
    DOI
    Publication status: Published - 7 Jul 2016
    Event: 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016 - Pisa, Italy
    Duration: 17 Jul 2016 to 21 Jul 2016

    Other

    Other: 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016
    Country: Italy
    City: Pisa
    Period: 17 Jul 2016 to 21 Jul 2016

    ASJC Scopus subject areas

    • Information Systems
    • Software

