Multi-Modal Fashion Product Retrieval

A. Rubio, Long Long Yu, E. Simo-Serra, F. Moreno-Noguer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Finding a product in the fashion world can be a daunting task. Every day, e-commerce sites are updated with thousands of images and their associated metadata (textual information), deepening the problem. In this paper, we leverage both the images and the textual metadata and propose a joint multi-modal embedding that maps both text and images into a common latent space. Distances in the latent space correspond to similarity between products, allowing us to perform retrieval effectively in this latent space. We compare against existing approaches and show significant improvements in retrieval tasks on a large-scale e-commerce dataset.
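The retrieval scheme the abstract describes — nearest-neighbor search over a shared latent space, where both text and image queries are embedded into the same space as the products — can be sketched as follows. This is a minimal illustration with toy embeddings and cosine similarity, not the authors' actual model or dataset:

```python
import numpy as np

def retrieve(query_vec, product_vecs, k=3):
    """Return indices of the k most similar products by cosine similarity.

    query_vec: latent embedding of a query (from text or image), shape (d,)
    product_vecs: latent embeddings of catalog products, shape (n, d)
    """
    q = query_vec / np.linalg.norm(query_vec)
    p = product_vecs / np.linalg.norm(product_vecs, axis=1, keepdims=True)
    sims = p @ q  # cosine similarity in the shared latent space
    return np.argsort(-sims)[:k]

# Toy example: 4 products embedded in a 3-d latent space
products = np.array([[1.0, 0.0, 0.0],
                     [0.9, 0.1, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
query = np.array([1.0, 0.05, 0.0])  # hypothetical query embedding
print(retrieve(query, products, k=2))  # two nearest products
```

Because both modalities share one space, the same `retrieve` call serves text-to-image, image-to-text, and cross-modal product search once an encoder has produced the embeddings.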

Original language: English
Title of host publication: VL 2017 - 6th Workshop on Vision and Language, Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 43-45
Number of pages: 3
ISBN (Electronic): 9781945626517
Publication status: Published - 2017
Event: 6th Workshop on Vision and Language, VL 2017 as part of EACL 2017 - Valencia, Spain
Duration: 2017 Apr 4 → …

Publication series

Name: VL 2017 - 6th Workshop on Vision and Language, Proceedings of the Workshop

Conference

Conference: 6th Workshop on Vision and Language, VL 2017 as part of EACL 2017
Country/Territory: Spain
City: Valencia
Period: 17/4/4 → …

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Linguistics and Language
