Model-based lip synchronization with automatically translated synthetic voice toward a multi-modal translation system

Shin Ogata, Kazumasa Murai, Satoshi Nakamura, Shigeo Morishima

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

In this paper, we introduce a multi-modal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion, synchronizing it to the translated speech. To retain the speaker's facial expression, we substitute only the image of the speech organs with a synthesized one, generated by a three-dimensional wire-frame model that can be adapted to any speaker. This approach enables image synthesis and translation with an extremely small database.
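As a rough illustration of the lip-synchronization step the abstract describes, the sketch below schedules one mouth shape (viseme) per video frame from the phoneme timings of the translated synthetic voice, so that only the speech-organ region of the face needs re-rendering from the wire-frame model. This is a minimal sketch under assumed conventions: the function names, the frame rate, and the phoneme-to-viseme table are all illustrative, not the authors' implementation.

```python
# Hypothetical sketch: derive per-frame mouth-shape targets from the phoneme
# timings of a translated synthetic utterance. All names and the toy
# phoneme-to-viseme table below are assumptions for illustration only.

FPS = 30  # assumed video frame rate

# Toy phoneme -> viseme table; a real system would cover the full phone set.
PHONEME_TO_VISEME = {
    "a": "open", "i": "spread", "u": "rounded",
    "m": "closed", "s": "narrow", "sil": "rest",
}

def viseme_track(phonemes, duration):
    """phonemes: list of (phoneme, start_sec, end_sec) tuples, as they might
    come from the speech synthesizer for the translated utterance.
    Returns one viseme label per video frame over `duration` seconds;
    each label would drive the mouth region of the 3-D wire-frame model."""
    n_frames = int(duration * FPS)
    track = ["rest"] * n_frames
    for phoneme, start, end in phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, "rest")
        for f in range(int(start * FPS), min(int(end * FPS), n_frames)):
            track[f] = viseme
    return track

if __name__ == "__main__":
    # Example timings for a short synthesized fragment ("ma-i").
    timed = [("sil", 0.0, 0.1), ("m", 0.1, 0.2), ("a", 0.2, 0.45), ("i", 0.45, 0.6)]
    print(viseme_track(timed, duration=0.6))
```

Because only the viseme-driven mouth region is re-rendered and composited back onto the original frames, the rest of the face, and hence the speaker's expression, is left untouched.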

Original language: English
Title of host publication: Proceedings - IEEE International Conference on Multimedia and Expo
Publisher: IEEE Computer Society
Pages: 28-31
Number of pages: 4
ISBN (Electronic): 0769511988
DOIs: 10.1109/ICME.2001.1237647
Publication status: Published - 2001 Jan 1
Event: 2001 IEEE International Conference on Multimedia and Expo, ICME 2001 - Tokyo, Japan
Duration: 2001 Aug 22 - 2001 Aug 25

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Other

Other: 2001 IEEE International Conference on Multimedia and Expo, ICME 2001
Country: Japan
City: Tokyo
Period: 01/8/22 - 01/8/25

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications

Cite this

Ogata, S., Murai, K., Nakamura, S., & Morishima, S. (2001). Model-based lip synchronization with automatically translated synthetic voice toward a multi-modal translation system. In Proceedings - IEEE International Conference on Multimedia and Expo (pp. 28-31). [1237647] (Proceedings - IEEE International Conference on Multimedia and Expo). IEEE Computer Society. https://doi.org/10.1109/ICME.2001.1237647