Psycholinguistic evidence has established the complementary nature of the verbal and non-verbal aspects of human expression. We present our findings on the detection of these cues in interaction. We use the psycholinguistic device known as the 'catchment' as the locus of integration of gesture, speech and gaze components. Using a calibrated three-camera setup, we videotape conversation elicitation experiments in which subjects convey complex spatial plans to an interlocutor. We extract the gestural motion of both hands, gaze direction, and voiced units in the discourse, and compare these with transcripts generated by expert microanalysis of the video. Our results show the complementary nature of these communicative modalities. Where there is ambiguity in the structure of one modality (such as in haplologies or owing to noise in the audio signal), other modalities provide evidence for correct segmentation.
|Journal||Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition|
|Publication status||Published - 2000|
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition