Human gesture analysis using multimodal features

Luo Dan, Hazim Kemal Ekenel, Jun Ohya

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    4 Citations (Scopus)

    Abstract

    Human gesture, as a natural interface, plays a crucial role in achieving intelligent Human Computer Interaction (HCI). Human gestures comprise different components of visual action, such as hand motion, facial expression, and torso movement, that convey meaning. So far, most previous work in the field of gesture recognition has focused on the manual component of gestures. In this paper, we present an appearance-based multimodal gesture recognition framework that combines different groups of features, such as facial expression features and hand motion features, extracted from image frames captured by a single web camera. We consider 12 classes of human gestures with facial expressions conveying neutral, negative, and positive meanings, drawn from American Sign Language (ASL). We combine the features at two levels by employing two fusion strategies. At the feature level, an early feature combination is performed by concatenating and weighting the different feature groups, and partial least squares (PLS) is used to select the most discriminative elements by projecting the features onto a discriminative expression space. The second strategy is applied at the decision level: weighted decisions from the single modalities are fused at a later stage. A condensation-based algorithm is adopted for classification. We collected a data set with three to seven recording sessions and conducted experiments with both combination techniques. Experimental results showed that facial analysis improves hand gesture recognition and that decision-level fusion performs better than feature-level fusion.
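
    The abstract sketches two fusion strategies: an early, feature-level combination (weighted concatenation of the per-modality feature groups followed by a PLS projection onto a discriminative space) and a late, decision-level combination (a weighted fusion of per-modality classifier outputs). As a rough illustration only, and not the authors' implementation, the Python sketch below shows both steps; the modality weights, array names, and helper functions are hypothetical, and scikit-learn's PLSRegression stands in for the paper's PLS projection.

        # Illustrative sketch, not the paper's code: feature-level and
        # decision-level fusion of face and hand features, with
        # scikit-learn's PLSRegression as a stand-in PLS projection.
        import numpy as np
        from sklearn.cross_decomposition import PLSRegression

        def early_fusion(face_feats, hand_feats, labels,
                         w_face=0.5, w_hand=0.5, n_components=10):
            """Feature-level fusion: weight, concatenate, project with PLS."""
            X = np.hstack([w_face * face_feats, w_hand * hand_feats])
            Y = np.eye(int(labels.max()) + 1)[labels]  # one-hot class indicators
            pls = PLSRegression(n_components=n_components)
            pls.fit(X, Y)
            return pls  # pls.transform(X) gives the discriminative projection

        def late_fusion(face_scores, hand_scores, w_face=0.4, w_hand=0.6):
            """Decision-level fusion: weighted sum of per-modality class scores."""
            fused = w_face * face_scores + w_hand * hand_scores
            return fused.argmax(axis=-1)  # predicted class per sample

    The sketch omits the paper's actual classifier, which is condensation-based (a particle-filtering approach); in that setting, the per-modality scores fed to the late fusion step would come from the condensation classifier rather than from PLS directly.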

    Original language: English
    Title of host publication: Proceedings of the 2012 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2012
    Pages: 471-476
    Number of pages: 6
    DOI: 10.1109/ICMEW.2012.88
    ISBN (Print): 9780769547299
    Publication status: Published - 2012
    Event: 2012 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2012 - Melbourne, VIC
    Duration: 2012 Jul 9 - 2012 Jul 13

    Keywords

    • Condensation Algorithm
    • Facial Expression
    • Gesture Recognition

    ASJC Scopus subject areas

    • Computer Graphics and Computer-Aided Design
    • Computer Vision and Pattern Recognition
    • Human-Computer Interaction

    Cite this

    Dan, L., Ekenel, H. K., & Ohya, J. (2012). Human gesture analysis using multimodal features. In Proceedings of the 2012 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2012 (pp. 471-476). [6266429] https://doi.org/10.1109/ICMEW.2012.88
