LoL-V2T: Large-scale esports video description dataset

Tsunehiko Tanaka, Edgar Simo-Serra

Research output: Conference contribution

Abstract

Esports is a fast-growing new field with a largely online presence, and it is creating demand for automatic domain-specific captioning tools. However, few approaches currently tackle the esports video description problem. In this work, we propose a large-scale dataset for esports video description, focusing on the popular game "League of Legends". The dataset, which we call LoL-V2T, is the largest video description dataset in the video game domain, and includes 9,723 clips with 62,677 captions. This new dataset presents multiple new video captioning challenges, such as large amounts of domain-specific vocabulary, subtle motions of great importance, and a temporal gap between most captions and the events they describe. To tackle the vocabulary issue, we propose masking the domain-specific words and provide additional annotations for this purpose. Our results show that the dataset poses a challenge to existing video captioning approaches and that the masking can significantly improve performance. Our dataset and code are publicly available.
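The masking idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that masking means replacing each domain-specific term in a caption with a category placeholder. The vocabulary `DOMAIN_TERMS`, the placeholder tokens, and the function `mask_caption` are hypothetical names chosen for illustration. They are not the dataset's actual annotations or the authors' released code.

```python
import re

# Hypothetical vocabulary mapping a domain-specific term to a category token.
# These entries are illustrative only, not the dataset's annotations.
DOMAIN_TERMS = {
    "ahri": "<Champion>",
    "lee sin": "<Champion>",
    "baron": "<Monster>",
    "fnatic": "<Team>",
}


def mask_caption(caption, terms=DOMAIN_TERMS):
    """Replace every known domain-specific term with its category token."""
    # Try longer terms first so multi-word names win over shorter overlaps.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    # Look up the matched text in lowercase to find its category token.
    return pattern.sub(lambda m: terms[m.group(0).lower()], caption)


print(mask_caption("Lee Sin kicks Ahri near Baron"))
```

Under this assumption, a captioning model would only need to predict a small set of category tokens instead of every champion, team, or monster name, which shrinks the effective vocabulary.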

Original language: English
Title of host publication: Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2021
Publisher: IEEE Computer Society
Pages: 4552-4561
Number of pages: 10
ISBN (Electronic): 9781665448994
DOI
Publication status: Published - Jun 2021
Event: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2021 - Virtual, Online, United States
Duration: 19 Jun 2021 to 25 Jun 2021

Publication series

Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
ISSN (Print): 2160-7508
ISSN (Electronic): 2160-7516

Conference

Conference: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2021
Country/Territory: United States
City: Virtual, Online
Period: 19 Jun 2021 to 25 Jun 2021

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
