Topic set size design for paired and unpaired data

Research output: Conference contribution

Abstract

Topic set size design is an approach to determining the sample sizes of an experiment (e.g., number of topics) based on a statistical requirement, namely a desired statistical power or a cap on the confidence interval (CI) width for the difference in means. Previous work considered paired data cases for a desired power of the t-test and for a cap on CI width, as well as unpaired data cases for a desired power of one-way ANOVA. In the present study, we consider unpaired (i.e., two-sample) cases for the t-test and for the CI width. Since one-way ANOVA with two groups is strictly equivalent to the two-sample t-test, we compare the outcomes of the topic set size design results based on these two approaches, and show that the one-way ANOVA-based approach actually returns tighter sample sizes than the two-sample t-test approach. Moreover, we compare the paired and unpaired cases for both t-test-based and CI-based topic set size design approaches. Because estimating the variance of the score differences for the paired data setting is problematic, we recommend the use of our unpaired-data versions of t-test-based and CI-based topic set size design tools, as they only require a variance estimate for individual scores and the appropriate sample sizes for unpaired data are also large enough for paired data.
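The relationship between the paired and unpaired settings described in the abstract can be illustrated with a rough sketch. The code below is not the authors' tool: it assumes a large-sample normal approximation in place of the t-distribution, equal group sizes, and a common score variance. The function names, parameter names, and the example variance are hypothetical. It also assumes the variance of paired score differences can be written as 2 * var * (1 - rho), where rho is the correlation between the two systems' per-topic scores; whenever rho is non-negative this is at most 2 * var, which is why an unpaired sample size would also suffice for paired data under these assumptions.

```python
import math
from statistics import NormalDist

_z = NormalDist().inv_cdf  # standard normal quantile function


def n_unpaired_power(var, min_diff, alpha=0.05, beta=0.20):
    """Approximate topics per group for a two-sided two-sample test.

    var: assumed common variance of individual scores.
    min_diff: smallest difference in means that should be detected.
    """
    z_sum = _z(1 - alpha / 2) + _z(1 - beta)
    return math.ceil(2 * var * z_sum ** 2 / min_diff ** 2)


def n_paired_power(var_diff, min_diff, alpha=0.05, beta=0.20):
    """Approximate number of topics for a two-sided paired test.

    var_diff: assumed variance of the per-topic score differences.
    """
    z_sum = _z(1 - alpha / 2) + _z(1 - beta)
    return math.ceil(var_diff * z_sum ** 2 / min_diff ** 2)


def n_unpaired_ci(var, max_width, alpha=0.05):
    """Approximate topics per group so the CI for the difference in means
    has total width at most max_width (unpaired data)."""
    z = _z(1 - alpha / 2)
    return math.ceil(8 * var * z ** 2 / max_width ** 2)


def n_paired_ci(var_diff, max_width, alpha=0.05):
    """Approximate number of topics so the CI for the mean difference
    has total width at most max_width (paired data)."""
    z = _z(1 - alpha / 2)
    return math.ceil(4 * var_diff * z ** 2 / max_width ** 2)


if __name__ == "__main__":
    var = 0.05  # hypothetical variance estimate for individual scores
    for rho in (0.0, 0.5, 0.9):
        var_diff = 2 * var * (1 - rho)  # assumed link between the two settings
        print(rho,
              n_unpaired_power(var, 0.1), n_paired_power(var_diff, 0.1),
              n_unpaired_ci(var, 0.1), n_paired_ci(var_diff, 0.1))
```

In this sketch the unpaired functions need only `var`, whereas the paired functions need `var_diff`, which in turn depends on a correlation that is hard to estimate in advance. An exact design would use the t-distribution and therefore return somewhat larger sample sizes for small n.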

Original language: English
Title of host publication: ICTIR 2018 - Proceedings of the 2018 ACM SIGIR International Conference on the Theory of Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 199-202
Number of pages: 4
ISBN (Electronic): 9781450356565
Publication status: Published - 10 Sep 2018
Event: 8th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2018 - Tianjin, China
Duration: 14 Sep 2018 to 17 Sep 2018

Publication series

Name: ICTIR 2018 - Proceedings of the 2018 ACM SIGIR International Conference on the Theory of Information Retrieval

Conference

Conference: 8th ACM SIGIR International Conference on the Theory of Information Retrieval, ICTIR 2018
Country: China
City: Tianjin
Period: 14 Sep 2018 to 17 Sep 2018

ASJC Scopus subject areas

  • Information Systems
  • Computer Science (miscellaneous)
