Unsupervised Answer Retrieval with Data Fusion for Community Question Answering

Sosuke Kato, Toru Shimizu, Sumio Fujita, Tetsuya Sakai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Community question answering (cQA) systems have enjoyed the benefits of advances in neural information retrieval, some models of which need annotated documents as supervised data. However, in contrast with the amount of supervised data for cQA systems, user-generated data in cQA sites have been increasing greatly with time. Thus, focusing on unsupervised models, we tackle a task of retrieving relevant answers for new questions from existing cQA data and propose two frameworks to exploit a Question Retrieval (QR) model for Answer Retrieval (AR). The first framework ranks answers according to the combined scores of QR and AR models and the second framework ranks answers using the scores of a QR model and best answer flags. In our experiments, we applied the combination of our proposed frameworks and a classical fusion technique to AR models with a Japanese cQA data set containing approximately 9.4M question-answer pairs. When best answer flags in the cQA data cannot be utilized, our combination of AR and QR scores with data fusion outperforms a base AR model on average. When best answer flags can be utilized, the retrieval performance can be improved further. While our results lack statistical significance, we discuss effect sizes as well as future sample sizes to attain sufficient statistical power.
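The two frameworks described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the `Candidate` record, the function names, the min-max normalization, and the CombSUM-style sum standing in for the unnamed "classical fusion technique". It only shows the general shape: one ranking that adds an answer's AR score to the QR score of the archived question it belongs to, and one that orders answers by QR score and prefers answers flagged as best.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record for one archived answer."""
    answer_id: str
    question_id: str  # archived question this answer was posted to
    ar_score: float   # score from some AR model (assumed given)
    is_best: bool     # best answer flag from the cQA site


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 0.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def rank_by_combined(cands: List[Candidate],
                     qr_scores: Dict[str, float]) -> List[str]:
    """Framework 1 (sketch): rank answers by normalized AR score plus the
    normalized QR score of the question each answer belongs to."""
    ar = min_max({c.answer_id: c.ar_score for c in cands})
    qr = min_max(qr_scores)
    fused = {c.answer_id: ar[c.answer_id] + qr[c.question_id] for c in cands}
    return sorted(fused, key=fused.get, reverse=True)


def rank_by_qr_and_flag(cands: List[Candidate],
                        qr_scores: Dict[str, float]) -> List[str]:
    """Framework 2 (sketch): rank answers by the QR score of their question,
    placing best-flagged answers ahead of others with the same QR score."""
    ordered = sorted(cands,
                     key=lambda c: (qr_scores[c.question_id], c.is_best),
                     reverse=True)
    return [c.answer_id for c in ordered]
```

In this sketch `qr_scores` maps each archived question ID to its QR score for the new question, and every candidate's `question_id` is assumed to appear in that mapping. A weighted sum or a rank-based fusion could be substituted for the plain sum without changing the overall structure.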

Original language: English
Title of host publication: Information Retrieval Technology - 15th Asia Information Retrieval Societies Conference, AIRS 2019, Proceedings
Editors: Fu Lee Wang, Haoran Xie, Wai Lam, Aixin Sun, Lun-Wei Ku, Tianyong Hao, Wei Chen, Tak-Lam Wong, Xiaohui Tao
Publisher: Springer
Pages: 10-21
Number of pages: 12
ISBN (Print): 9783030428341
DOIs
Publication status: Published - 2020
Event: 15th Asia Information Retrieval Societies Conference, AIRS 2019 - Kowloon, Hong Kong
Duration: 2019 Nov 7 → 2019 Nov 9

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12004 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th Asia Information Retrieval Societies Conference, AIRS 2019
Country: Hong Kong
City: Kowloon
Period: 19/11/7 → 19/11/9

Keywords

  • Answer Retrieval
  • Community question answering
  • Data fusion
  • Question Retrieval
  • Unsupervised model

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
