Video semantic indexing using object detection-derived features

Kotaro Kikuchi, Kazuya Ueki, Tetsuji Ogawa, Tetsunori Kobayashi

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    A new feature extraction method based on object detection is proposed to achieve accurate and robust semantic indexing of videos. Local features (e.g., SIFT and HOG) and convolutional neural network (CNN)-derived features, which have been used in semantic indexing, are generally extracted from the entire image and do not explicitly represent the information of the meaningful objects that determines semantic categories. As a result, the background region, which contains no meaningful objects, is unduly considered, harming indexing performance. In the present study, an attempt was made to suppress these undesirable effects of redundant background information by incorporating object detection into semantic indexing. In the proposed method, the combination of meaningful objects detected in a video frame image is represented as a feature vector for the verification of semantic categories. Experimental comparisons demonstrate that the proposed method improves performance on the TRECVID semantic indexing task.
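    The abstract's core idea — pooling the objects detected in a frame into a single feature vector for category verification — can be sketched as follows. This is an illustrative reconstruction, not the paper's actual implementation: the vocabulary, pooling rule (max over per-class confidences), and function names are assumptions.

    ```python
    # Hypothetical sketch of object-detection-derived features as described
    # in the abstract: detections in a frame are pooled into a fixed-length
    # vector over a known object vocabulary. Labels and pooling choice are
    # illustrative, not taken from the paper.

    VOCAB = ["person", "car", "dog", "chair"]  # assumed detector label set

    def detection_feature(detections, vocab=VOCAB):
        """Max-pool detection confidences per object class into one vector.

        detections: list of (label, score) pairs from an object detector.
        Returns a list of floats, one per vocabulary entry.
        """
        feat = [0.0] * len(vocab)
        index = {label: i for i, label in enumerate(vocab)}
        for label, score in detections:
            i = index.get(label)
            if i is not None:
                feat[i] = max(feat[i], score)
        return feat

    frame_dets = [("person", 0.92), ("car", 0.45), ("person", 0.71)]
    print(detection_feature(frame_dets))  # [0.92, 0.45, 0.0, 0.0]
    ```

    A vector of this form, unlike whole-image features, is zero wherever no relevant object was detected, which is one plausible way the background region's influence could be suppressed.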

    Original language: English
    Title of host publication: 2016 24th European Signal Processing Conference, EUSIPCO 2016
    Publisher: European Signal Processing Conference, EUSIPCO
    Pages: 1288-1292
    Number of pages: 5
    Volume: 2016-November
    ISBN (Electronic): 9780992862657
    DOI: 10.1109/EUSIPCO.2016.7760456
    Publication status: Published - 2016 Nov 28
    Event: 24th European Signal Processing Conference, EUSIPCO 2016 - Budapest, Hungary
    Duration: 2016 Aug 28 - 2016 Sep 2



    ASJC Scopus subject areas

    • Signal Processing
    • Electrical and Electronic Engineering

    Cite this

    Kikuchi, K., Ueki, K., Ogawa, T., & Kobayashi, T. (2016). Video semantic indexing using object detection-derived features. In 2016 24th European Signal Processing Conference, EUSIPCO 2016 (Vol. 2016-November, pp. 1288-1292). [7760456] European Signal Processing Conference, EUSIPCO. https://doi.org/10.1109/EUSIPCO.2016.7760456
