Appearance-based human gesture recognition using multimodal features for human computer interaction

Dan Luo, Hua Gao, Hazim Kemal Ekenel, Jun Ohya

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    The use of gesture as a natural interface plays a vital role in achieving intelligent Human Computer Interaction (HCI). Human gestures comprise different components of visual action, such as hand motion, facial expression, and torso movement, to convey meaning. So far, most work in the field of gesture recognition has focused on the manual component of gestures. In this paper, we present an appearance-based multimodal gesture recognition framework that combines different feature groups, such as facial expression features and hand motion features, extracted from image frames captured by a single web camera. We consider 12 classes of human gestures with facial expressions conveying neutral, negative, and positive meanings, drawn from American Sign Language (ASL). We combine the features at two levels by employing two fusion strategies. At the feature level, an early combination is performed by concatenating and weighting the different feature groups, and LDA is used to select the most discriminative elements by projecting the features onto a discriminative expression space. The second strategy is applied at the decision level: weighted decisions from the individual modalities are fused at a later stage. A condensation-based algorithm is adopted for classification. We collected a data set with three to seven recording sessions and conducted experiments with both combination techniques. Experimental results showed that facial analysis improves hand gesture recognition, and that decision-level fusion performs better than feature-level fusion.
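    The two fusion strategies described above can be sketched as follows. This is a minimal, illustrative NumPy sketch, not the authors' implementation: the feature dimensions, fusion weights, and per-modality scores are made-up placeholders, a basic Fisher-style LDA stands in for the paper's projection onto a discriminative expression space, and the condensation-based classifier is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_classes = 120, 12
face = rng.normal(size=(n, 10))   # hypothetical facial-expression features
hand = rng.normal(size=(n, 15))   # hypothetical hand-motion features
y = np.arange(n) % n_classes      # balanced toy labels, 10 samples per class

# --- Feature-level (early) fusion: weight and concatenate the groups ---
w_face, w_hand = 0.4, 0.6         # illustrative modality weights
X = np.hstack([w_face * face, w_hand * hand])

# Minimal Fisher LDA: project onto the leading generalized eigenvectors
# of the between-class scatter Sb against the within-class scatter Sw.
mu = X.mean(axis=0)
d = X.shape[1]
Sw = np.zeros((d, d))
Sb = np.zeros((d, d))
for c in range(n_classes):
    Xc = X[y == c]
    mc = Xc.mean(axis=0)
    Sw += (Xc - mc).T @ (Xc - mc)
    diff = (mc - mu)[:, None]
    Sb += len(Xc) * (diff @ diff.T)
evals, evecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(d), Sb))
order = np.argsort(-evals.real)[: n_classes - 1]
W = evecs.real[:, order]
projected = X @ W                 # samples in the discriminative space

# --- Decision-level (late) fusion: weighted per-modality class scores ---
scores_face = rng.random((n, n_classes))  # placeholder classifier outputs
scores_hand = rng.random((n, n_classes))
final = w_face * scores_face + w_hand * scores_hand
pred = final.argmax(axis=1)       # fused class decision per sample
```

    In the early strategy a single classifier sees the LDA-projected fused vector; in the late strategy each modality is classified separately and only the weighted scores are combined, which is the variant the abstract reports as performing better.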

    Original language: English
    Title of host publication: Proceedings of SPIE - The International Society for Optical Engineering
    Volume: 7865
    ISBN: 9780819484024
    DOI: 10.1117/12.872525
    Publication status: Published - 2011
    Event: Human Vision and Electronic Imaging XVI - San Francisco, CA
    Duration: 2011 Jan 24 - 2011 Jan 27



    Keywords

    • Condensation Algorithm
    • Facial Expression
    • Gesture Recognition

    ASJC Scopus subject areas

    • Applied Mathematics
    • Computer Science Applications
    • Electrical and Electronic Engineering
    • Electronic, Optical and Magnetic Materials
    • Condensed Matter Physics

    Cite this

    Luo, D., Gao, H., Ekenel, H. K., & Ohya, J. (2011). Appearance-based human gesture recognition using multimodal features for human computer interaction. In Proceedings of SPIE - The International Society for Optical Engineering (Vol. 7865). [786509] https://doi.org/10.1117/12.872525
