Generating Short Product Descriptors Based on Very Little Training Data

Peng Xiao, Joo Young Lee, Sijie Tao, Young Sook Hwang, Tetsuya Sakai*

*Corresponding author for this work

Research output: Conference contribution

Abstract

We propose a pipeline model for summarising a short textual product description for inclusion in an online advertisement banner. While a standard approach is to truncate the advertiser’s original product description so that the text will fit the small banner, this simplistic approach often removes crucial information or attractive expressions from the original description. Our objective is to shorten the original description more intelligently, so that users’ click-through rate (CTR) will improve. One major difficulty in this task, however, is the lack of large training data: machine learning methods that rely on thousands of pairs of the original and shortened texts would not be practical. Hence, our proposed method first employs a semisupervised sequence tagging method called TagLM to convert the original description into a sequence of entities, and then a BiLSTM entity ranker which determines which entities should be preserved: the main idea is to tackle the data sparsity problem by leveraging sequences of entities rather than sequences of words. In our offline experiments with Korean data from travel and fashion domains, our sequence tagger outperforms an LSTM-CRF baseline, and our entity ranker outperforms LambdaMART and RandomForest baselines. More importantly, in our online A/B testing where the proposed method was compared to the simple truncation approach, the CTR improved by 34.1% in the desktop PC environment.
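
The contrast the abstract draws between plain truncation and entity-level selection can be illustrated with a toy sketch. This is not the authors' implementation: the TagLM tagger and the BiLSTM ranker are replaced here by hand-assigned entity spans and scores, and the greedy character-budget selection, the `Entity` class, the label names, and the English example text are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    text: str     # surface string of a tagged span
    label: str    # entity type (hypothetical label set)
    score: float  # importance score; stands in for a learned ranker's output


def truncate(description: str, budget: int) -> str:
    """Baseline: keep only the first `budget` characters."""
    return description[:budget]


def select_entities(entities: list[Entity], budget: int) -> str:
    """Greedily keep the highest-scoring entities that fit in `budget`
    characters, then emit the kept entities in their original order."""
    kept: list[int] = []
    used = 0
    by_score = sorted(range(len(entities)), key=lambda i: -entities[i].score)
    for idx in by_score:
        # One extra character for the joining space after the first entity.
        cost = len(entities[idx].text) + (1 if kept else 0)
        if used + cost <= budget:
            kept.append(idx)
            used += cost
    return " ".join(entities[i].text for i in sorted(kept))


# Invented example; the tags and scores are hand-assigned, not model output.
description = "Luxury beachfront resort in Jeju with free breakfast, 40% off this week"
entities = [
    Entity("Luxury beachfront resort", "QUALITY", 0.4),
    Entity("in Jeju", "LOCATION", 0.7),
    Entity("with free breakfast", "BENEFIT", 0.6),
    Entity("40% off", "DISCOUNT", 0.9),
    Entity("this week", "TIME", 0.3),
]
budget = 36

baseline = truncate(description, budget)        # cuts off before the discount
shortened = select_entities(entities, budget)

print(baseline)
print(shortened)  # in Jeju with free breakfast 40% off
```

In this toy case the truncated text loses the discount entirely, while the entity-based version keeps the three highest-scoring entities within the same 36-character budget.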

Original language: English
Title of host publication: Information Retrieval Technology - 15th Asia Information Retrieval Societies Conference, AIRS 2019, Proceedings
Editors: Fu Lee Wang, Haoran Xie, Wai Lam, Aixin Sun, Lun-Wei Ku, Tianyong Hao, Wei Chen, Tak-Lam Wong, Xiaohui Tao
Publisher: Springer
Pages: 133-144
Number of pages: 12
ISBN (Print): 9783030428341
DOI
Publication status: Published - 2020
Event: 15th Asia Information Retrieval Societies Conference, AIRS 2019 - Kowloon, Hong Kong
Duration: 7 Nov 2019 to 9 Nov 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12004 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th Asia Information Retrieval Societies Conference, AIRS 2019
Country/Territory: Hong Kong
City: Kowloon
Period: 7 Nov 2019 to 9 Nov 2019

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
