LSTM vs. BM25 for Open-domain QA: A hands-on comparison of effectiveness and efficiency

Sosuke Kato, Riku Togashi, Hideyuki Maeda, Sumio Fujita, Tetsuya Sakai

Research output: Conference contribution

2 Citations (Scopus)

Abstract

Recent advances in neural networks, along with the growth of rich and diverse community question answering (cQA) data, have enabled researchers to construct robust open-domain question answering (QA) systems. It is often claimed that such state-of-the-art QA systems far outperform traditional IR baselines such as BM25. However, most such studies rely on relatively small data sets, e.g., those extracted from the old TREC QA tracks. Given massive training data plus a separate corpus of Q&A pairs as the target knowledge source, how well would such a system really perform? How fast would it respond? In this demonstration, we provide the attendees of SIGIR 2017 an opportunity to experience a live comparison of two open-domain QA systems, one based on a long short-term memory (LSTM) architecture with over 11 million Yahoo! Chiebukuro (i.e., Japanese Yahoo! Answers) questions and over 27.4 million answers for training, and the other based on BM25. Both systems use the same Q&A knowledge source for answer retrieval. Our core demonstration system is a pair of Japanese monolingual QA systems, but we leverage machine translation for letting the SIGIR attendees enter English questions and compare the Japanese responses from the two systems after translating them into English.

Original language: English
Title of host publication: SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 1309-1312
Number of pages: 4
ISBN (Electronic): 9781450350228
DOI
Publication status: Published - 7 Aug 2017
Event: 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2017 - Tokyo, Shinjuku, Japan
Duration: 7 Aug 2017 to 11 Aug 2017

Publication series

Name: SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval

Other

Other: 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2017
Country/Territory: Japan
City: Tokyo, Shinjuku
Period: 7 Aug 2017 to 11 Aug 2017

ASJC Scopus subject areas

  • Information Systems
  • Software
  • Computer Graphics and Computer-Aided Design
