Abstract
This paper investigates audio-visual interaction, i.e. inter-modal influences, in linear-regressive model adaptation for multi-modal speech recognition. In multi-modal adaptation, inter-modal information may contribute to the performance of speech recognition, so the influence and advantages of inter-modal elements should be examined. Experiments were conducted to evaluate several transformation matrices that include or exclude inter-modal and intra-modal elements, using noisy data from an audio-visual corpus. The experimental results clarify the importance of making effective use of audio-visual interaction.
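The transformation matrices compared in the abstract can be pictured as block matrices acting on concatenated audio-visual feature vectors: intra-modal blocks map each modality onto itself, while inter-modal blocks let one modality influence the other. The sketch below is a minimal illustration of that block structure only, not the authors' implementation; the feature dimensions, function name, and masking scheme are assumptions made for illustration.

```python
import numpy as np

# Assumed feature dimensions for concatenated audio-visual vectors
# (hypothetical values, not taken from the paper).
AUDIO_DIM, VISUAL_DIM = 39, 30
DIM = AUDIO_DIM + VISUAL_DIM

def mask_transform(W, use_inter_modal=True, use_intra_modal=True):
    """Zero out blocks of an MLLR-style transformation matrix W
    (shape DIM x (DIM + 1), last column is the bias term) so that
    intra-modal and/or inter-modal elements are included or excluded,
    mirroring the kinds of matrix variants the abstract compares."""
    A, b = W[:, :DIM].copy(), W[:, DIM:].copy()
    a = slice(0, AUDIO_DIM)          # audio rows/columns
    v = slice(AUDIO_DIM, DIM)        # visual rows/columns
    if not use_intra_modal:
        A[a, a] = 0.0                # audio  -> audio block
        A[v, v] = 0.0                # visual -> visual block
    if not use_inter_modal:
        A[a, v] = 0.0                # visual -> audio block
        A[v, a] = 0.0                # audio  -> visual block
    return np.hstack([A, b])

# Usage example: adapt a Gaussian mean with and without inter-modal elements.
rng = np.random.default_rng(0)
W = rng.normal(size=(DIM, DIM + 1))
mu = rng.normal(size=DIM)
mu_ext = np.append(mu, 1.0)                                    # extended mean [mu; 1]
mu_full = mask_transform(W) @ mu_ext                           # full matrix (both kinds of elements)
mu_intra = mask_transform(W, use_inter_modal=False) @ mu_ext   # intra-modal elements only
```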
Original language | English |
---|---|
Pages | 875-878 |
Number of pages | 4 |
Publication status | Published - 2011 |
Externally published | Yes |
Event | Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011 - Xi'an, China. Duration: 2011 Oct 18 → 2011 Oct 21 |
Conference
Conference | Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011 |
---|---|
Country/Territory | China |
City | Xi'an |
Period | 2011/10/18 → 2011/10/21 |
ASJC Scopus subject areas
- Information Systems
- Signal Processing