Unsupervised Answer Retrieval with Data Fusion for Community Question Answering

Sosuke Kato*, Toru Shimizu, Sumio Fujita, Tetsuya Sakai

*Corresponding author for this work

Research output: Conference contribution

Abstract

Community question answering (cQA) systems have enjoyed the benefits of advances in neural information retrieval, some models of which need annotated documents as supervised data. However, in contrast with the amount of supervised data for cQA systems, user-generated data in cQA sites have been increasing greatly with time. Thus, focusing on unsupervised models, we tackle a task of retrieving relevant answers for new questions from existing cQA data and propose two frameworks to exploit a Question Retrieval (QR) model for Answer Retrieval (AR). The first framework ranks answers according to the combined scores of QR and AR models and the second framework ranks answers using the scores of a QR model and best answer flags. In our experiments, we applied the combination of our proposed frameworks and a classical fusion technique to AR models with a Japanese cQA data set containing approximately 9.4M question-answer pairs. When best answer flags in the cQA data cannot be utilized, our combination of AR and QR scores with data fusion outperforms a base AR model on average. When best answer flags can be utilized, the retrieval performance can be improved further. While our results lack statistical significance, we discuss effect sizes as well as future sample sizes to attain sufficient statistical power.
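
The two frameworks described in the abstract can be illustrated with a minimal sketch. The abstract does not name the fusion technique or the exact scoring functions, so everything here is an assumption: CombSUM over min-max normalized scores stands in for the "classical fusion technique", and the function names `min_max`, `fuse_qr_ar`, and `rank_by_best_answer` are hypothetical.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def fuse_qr_ar(
    ar_scores: Dict[str, float],
    qr_scores: Dict[str, float],
    answer_to_question: Dict[str, str],
) -> List[Tuple[str, float]]:
    """Framework 1 (assumed form): CombSUM of an answer's AR score and the
    QR score of the question that the answer was posted to."""
    ar_norm = min_max(ar_scores)
    qr_norm = min_max(qr_scores)
    fused = {}
    for answer_id, ar_value in ar_norm.items():
        question_id = answer_to_question.get(answer_id)
        # Answers whose question was not retrieved get no QR contribution.
        fused[answer_id] = ar_value + qr_norm.get(question_id, 0.0)
    # Sort by descending fused score, breaking ties by answer id.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


def rank_by_best_answer(
    qr_scores: Dict[str, float],
    best_answer_of: Dict[str, str],
) -> List[str]:
    """Framework 2 (assumed form): walk questions in descending QR score and
    emit the answer flagged as best for each, when such a flag exists."""
    ranked_questions = sorted(qr_scores, key=lambda q: (-qr_scores[q], q))
    return [best_answer_of[q] for q in ranked_questions if q in best_answer_of]
```

In this sketch an answer inherits the retrieval score of its parent question, so an answer that scores modestly under the AR model can still rise in the ranking when its question closely matches the new query.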

Original language: English
Title of host publication: Information Retrieval Technology - 15th Asia Information Retrieval Societies Conference, AIRS 2019, Proceedings
Editors: Fu Lee Wang, Haoran Xie, Wai Lam, Aixin Sun, Lun-Wei Ku, Tianyong Hao, Wei Chen, Tak-Lam Wong, Xiaohui Tao
Publisher: Springer
Pages: 10-21
Number of pages: 12
ISBN (Print): 9783030428341
DOI
Publication status: Published - 2020
Event: 15th Asia Information Retrieval Societies Conference, AIRS 2019 - Kowloon, Hong Kong
Duration: 7 Nov 2019 to 9 Nov 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12004 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th Asia Information Retrieval Societies Conference, AIRS 2019
Country/Territory: Hong Kong
City: Kowloon
Period: 7 Nov 2019 to 9 Nov 2019

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
