Abstract
In this paper, we report our efforts on the TRECVID ad-hoc video search (AVS) task and the challenges we encountered. The goal of the AVS task is to build a zero-shot video retrieval system for complex query phrases. Our system has two characteristics. First, we prepared in advance a large number of pre-trained concept classifiers that can detect various kinds of objects, persons, scenes, and actions. This strategy improves the coverage of keywords in query phrases. Second, we selected additional concept classifiers using natural language processing techniques, such as word similarity and synonyms. We submitted systems with these two characteristics to the TRECVID AVS task in 2016 and 2017, and in both years one of our systems ranked highest among all submitted systems.
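The two characteristics described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the concept bank, the synonym table, the stopword list, the function names `select_concepts` and `score_video`, and the choice to average classifier scores.

```python
# Hypothetical concept bank: names of pre-trained classifiers (illustrative only).
CONCEPT_BANK = {"dog", "beach", "running", "person", "car"}

# Hypothetical synonym table standing in for a real lexical resource.
SYNONYMS = {"puppy": ["dog"], "seaside": ["beach"], "jogging": ["running"]}

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "on", "in", "of", "at", "with"}


def select_concepts(query, bank=CONCEPT_BANK, synonyms=SYNONYMS):
    """Map query keywords to concept classifiers.

    A keyword that names a concept directly is used as-is; otherwise its
    synonyms are looked up and kept only if they exist in the bank.
    """
    selected = []
    for word in query.lower().split():
        if word in STOPWORDS:
            continue
        if word in bank:
            candidates = [word]
        else:
            candidates = [c for c in synonyms.get(word, []) if c in bank]
        for concept in candidates:
            if concept not in selected:  # keep order, avoid duplicates
                selected.append(concept)
    return selected


def score_video(concept_scores, concepts):
    """Combine per-concept classifier scores for one video.

    Averaging is an assumption of this sketch, not the paper's fusion rule.
    """
    if not concepts:
        return 0.0
    return sum(concept_scores.get(c, 0.0) for c in concepts) / len(concepts)


if __name__ == "__main__":
    concepts = select_concepts("a puppy jogging on the beach")
    video_scores = {"dog": 0.9, "running": 0.6, "beach": 0.3}
    print(concepts, score_video(video_scores, concepts))
```

Because no query-specific training data is used, retrieval quality in a sketch like this depends entirely on how many query keywords can be mapped to some classifier, which is why a larger concept bank and synonym expansion both raise keyword coverage.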
Original language | English |
---|---|
Pages (from-to) | 983-990 |
Number of pages | 8 |
Journal | Seimitsu Kogaku Kaishi/Journal of the Japan Society for Precision Engineering |
Volume | 84 |
Issue number | 12 |
Publication status | Published - 2018 Jan 1 |
Externally published | Yes |
Keywords
- Ad-hoc video search
- Convolutional neural network
- TRECVID
- Video retrieval
- Zero-shot learning
ASJC Scopus subject areas
- Mechanical Engineering