Improving semantic video indexing: Efforts in Waseda TRECVID 2015 SIN system

Kazuya Ueki, Tetsunori Kobayashi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we propose a method for improving the performance of semantic video indexing. Our approach extracts features from multiple convolutional neural networks (CNNs), trains multiple classifiers on those features, and integrates the classifiers. We employed four measures to accomplish this: (1) using the multiple pieces of evidence observed in each video and compressing them into a fixed-length vector; (2) introducing gradient and motion features to the CNNs; (3) enriching the variation of the training and testing sets; and (4) extracting features from several CNNs trained on various large-scale datasets. Using the test dataset from the TRECVID 2014 evaluation benchmark, we evaluated the proposed method in terms of mean extended inferred average precision. On this measure, our system scored 35.7, outperforming the state-of-the-art TRECVID 2014 benchmark result of 33.2. Based on this work, our submission to TRECVID 2015 was ranked second among all submissions.
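As a rough illustration of measure (1) and of the final integration step, the sketch below pools a variable number of per-frame feature vectors into one fixed-length vector and then combines the scores of several classifiers. The abstract does not state which pooling operator or fusion rule the authors used, so element-wise max pooling and plain score averaging are assumptions here, and the function names `max_pool_frames` and `fuse_scores` are hypothetical.

```python
from statistics import fmean


def max_pool_frames(frame_features):
    """Collapse per-frame feature vectors into one fixed-length vector.

    Assumption: element-wise max pooling; the abstract only says the
    evidence is compressed into a fixed-length vector.
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    dim = len(frame_features[0])
    if any(len(f) != dim for f in frame_features):
        raise ValueError("all frames must have the same feature dimension")
    # zip(*...) groups the values of each feature dimension across frames.
    return [max(column) for column in zip(*frame_features)]


def fuse_scores(scores):
    """Late fusion of classifier scores for one shot (assumed: simple mean)."""
    return fmean(scores)


# Two frames of a hypothetical shot, each with a 3-dimensional feature.
shot = [[0.1, 0.9, 0.0],
        [0.4, 0.2, 0.3]]
pooled = max_pool_frames(shot)          # length 3, regardless of frame count

# Scores from three hypothetical classifiers for the same concept.
fused = fuse_scores([0.25, 0.5, 0.75])
```

The pooled vector has the same length whether a shot contributes two frames or two hundred, which is what lets a single classifier handle shots of different durations.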

Original language: English
Title of host publication: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1184-1188
Number of pages: 5
ISBN (Electronic): 9781479999880
DOIs: https://doi.org/10.1109/ICASSP.2016.7471863
Publication status: Published - 2016 May 18
Event: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China
Duration: 2016 Mar 20 → 2016 Mar 25

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2016-May
ISSN (Print): 1520-6149

Other

Other: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Country: China
City: Shanghai
Period: 16/3/20 → 16/3/25

Keywords

  • CNN
  • Semantic video indexing
  • TRECVID
  • generic object recognition
  • video search

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Ueki, K., & Kobayashi, T. (2016). Improving semantic video indexing: Efforts in Waseda TRECVID 2015 SIN system. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings (pp. 1184-1188). [7471863] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2016-May). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2016.7471863