TY - JOUR
T1 - Auditory-visual speech perception examined by fMRI and PET
AU - Sekiyama, Kaoru
AU - Kanno, Iwao
AU - Miura, Shuichi
AU - Sugita, Yoichi
PY - 2003/11/1
Y1 - 2003/11/1
N2 - Cross-modal binding in auditory-visual speech perception was investigated using the McGurk effect, a phenomenon in which hearing is altered by incongruent visual mouth movements. We used functional magnetic resonance imaging (fMRI) and positron emission tomography (PET). In each experiment, subjects were asked to identify spoken syllables ('ba', 'da', 'ga') presented auditorily, visually, or audiovisually (incongruent stimuli). The auditory component of the stimuli was presented at two levels of intelligibility (High versus Low), as determined by the signal-to-noise (SN) ratio. The control task was visual talker identification of still faces. In the Low intelligibility condition, in which the auditory component of the speech was harder to hear, the visual influence was much stronger. Brain imaging data showed bilateral activations specific to the unimodal auditory stimuli (in the temporal cortex) and visual stimuli (in MT/V5). For the bimodal audiovisual stimuli, activation in the left temporal cortex extended more posteriorly toward the visual-specific area in the Low intelligibility condition. A direct comparison between the Low and High audiovisual conditions showed increased activation in the posterior part of the left superior temporal sulcus (STS), indicating its relationship with the stronger visual influence. These results suggest that this region is involved in cross-modal binding of auditory-visual speech.
KW - Auditory-visual integration
KW - Cross-modal binding
KW - fMRI
KW - PET
KW - Speech perception
KW - Superior temporal sulcus
KW - The McGurk effect
UR - http://www.scopus.com/inward/record.url?scp=0142042913&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0142042913&partnerID=8YFLogxK
U2 - 10.1016/S0168-0102(03)00214-1
DO - 10.1016/S0168-0102(03)00214-1
M3 - Article
C2 - 14568109
AN - SCOPUS:0142042913
VL - 47
SP - 277
EP - 287
JO - Neuroscience Research
JF - Neuroscience Research
SN - 0168-0102
IS - 3
ER -