Audio-visual interaction in model adaptation for multi-modal speech recognition

Satoshi Tamura, Masanao Oonishi, Satoru Hayamizu

Research output: Contribution to conference › Paper › peer-review

4 Citations (Scopus)

Abstract

This paper investigates audio-visual interaction, i.e. inter-modal influences, in linear-regressive model adaptation for multi-modal speech recognition. In multi-modal adaptation, inter-modal information may contribute to the performance of speech recognition, so the influence and advantage of inter-modal elements should be examined. Experiments were conducted to evaluate several transformation matrices including or excluding inter-modal and intra-modal elements, using noisy data from an audio-visual corpus. The experimental results clarify the importance of effectively using audio-visual interaction.
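
To make the inter-modal vs. intra-modal distinction concrete, the sketch below (not from the paper; the dimensions, variable names, and initialization are illustrative assumptions) builds a linear-regressive, MLLR-style transform over a concatenated audio-visual mean vector. The off-diagonal blocks of the transformation matrix carry the inter-modal elements; zeroing them leaves an intra-modal-only adaptation.

    import numpy as np

    # Illustrative dimensions for a stacked audio-visual feature vector
    # (hypothetical; the paper's actual feature dimensions may differ).
    D_AUDIO, D_VIDEO = 39, 30
    D = D_AUDIO + D_VIDEO

    rng = np.random.default_rng(0)

    # A full linear-regressive transform adapting a stacked mean
    # mu_av = [mu_audio; mu_video]:  mu_adapted = A @ mu_av + b.
    A_full = np.eye(D) + 0.01 * rng.standard_normal((D, D))
    b = np.zeros(D)

    # Intra-modal-only variant: zero the off-diagonal blocks so audio
    # means are adapted only by audio elements and video means only by
    # video elements, i.e. inter-modal interaction is excluded.
    A_intra = A_full.copy()
    A_intra[:D_AUDIO, D_AUDIO:] = 0.0  # audio <- video block removed
    A_intra[D_AUDIO:, :D_AUDIO] = 0.0  # video <- audio block removed

    mu_av = rng.standard_normal(D)
    mu_full = A_full @ mu_av + b    # uses audio-visual interaction
    mu_intra = A_intra @ mu_av + b  # ignores inter-modal elements

    # Difference attributable purely to the inter-modal terms.
    print(np.abs(mu_full - mu_intra).max())

Comparing recognition accuracy between such full and block-diagonal transforms, estimated on noisy adaptation data, is one way to isolate the contribution of the inter-modal elements the abstract describes.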

Original language: English
Pages: 875-878
Number of pages: 4
Publication status: Published - 2011
Externally published: Yes
Event: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011 - Xi'an, China
Duration: 2011 Oct 18 - 2011 Oct 21

Conference

Conference: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2011, APSIPA ASC 2011
Country: China
City: Xi'an
Period: 11/10/18 - 11/10/21

ASJC Scopus subject areas

  • Information Systems
  • Signal Processing
