Multi-modal joint embedding for fashion product retrieval

A. Rubio, Longlong Yu, E. Simo-Serra, F. Moreno-Noguer

Research output: Conference contribution

3 citations (Scopus)

Abstract

Finding a product in the fashion world can be a daunting task. Every day, e-commerce sites are updated with thousands of images and their associated metadata (textual information), deepening the problem, akin to finding a needle in a haystack. In this paper, we leverage both the images and the textual metadata and propose a joint multi-modal embedding that maps both the text and the images into a common latent space. Distances in the latent space correspond to similarity between products, allowing us to effectively perform retrieval in this latent space, which is both efficient and accurate. We train this embedding on large-scale real-world e-commerce data by both minimizing the distance between related products and using auxiliary classification networks that encourage the embedding to have semantic meaning. We compare against existing approaches and show significant improvements in retrieval tasks on a large-scale e-commerce dataset. We also provide an analysis of the different metadata.
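The retrieval mechanism the abstract describes can be sketched in a few lines: both modalities are projected into a common latent space, and products are ranked by their distance to the query embedding. The sketch below is illustrative only; the dimensions, the random linear projections (standing in for the paper's trained encoders), and the toy catalogue are all assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): raw image features,
# bag-of-words text features, and the shared latent space.
D_IMG, D_TXT, D_LATENT = 512, 300, 128

# Stand-ins for the trained encoders: in the paper these are learned
# networks; plain linear projections illustrate the mapping here.
W_img = rng.normal(size=(D_IMG, D_LATENT))
W_txt = rng.normal(size=(D_TXT, D_LATENT))

def embed(features, W):
    """Project features into the latent space and L2-normalize them,
    so Euclidean distance is monotone in cosine similarity."""
    z = features @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

# A toy catalogue of 100 products, each with an image feature vector.
img_feats = rng.normal(size=(100, D_IMG))
txt_feats = rng.normal(size=(100, D_TXT))
catalogue = embed(img_feats, W_img)   # image embeddings, shape (100, 128)

# Cross-modal retrieval: embed a text query, rank products by distance.
query = embed(txt_feats[0], W_txt)    # text embedding, shape (128,)
dists = np.linalg.norm(catalogue - query, axis=1)
ranking = np.argsort(dists)           # nearest products first
top5 = ranking[:5]                    # retrieved product indices
```

Training would then pull embeddings of related image/text pairs together (the distance-minimization term) while auxiliary classifiers on product attributes keep the space semantically organized.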

Original language: English
Host publication title: 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings
Publisher: IEEE Computer Society
Pages: 400-404
Number of pages: 5
ISBN (electronic): 9781509021758
DOI
Publication status: Published - 20 Feb 2018
Event: 24th IEEE International Conference on Image Processing, ICIP 2017 - Beijing, China
Duration: 17 Sep 2017 → 20 Sep 2017

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
Volume: 2017-September
ISSN (print): 1522-4880

Other

Other: 24th IEEE International Conference on Image Processing, ICIP 2017
Country/Territory: China
City: Beijing
Period: 17/9/17 → 17/9/20

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing

