Speech-to-image media conversion based on VQ and neural network

Shigeo Morishima, Hiroshi Harashima

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Automatic media conversion schemes from speech to a facial image and the construction of a real-time image synthesis system are presented. The purpose of this research is to realize an intelligent human-machine interface or intelligent communication system with synthesized human face images. A human face image is reconstructed on the display of a terminal using a 3-D surface model and a texture mapping technique. Facial motion images are synthesized by transformation of the 3-D model. With the motion driving method, based on vector quantization and a neural network, the synthesized head image can appear to speak given words and phrases naturally, in synchronization with the voice signal of a speaker.
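
The abstract outlines a mapping from speech to facial motion based on vector quantization. As a rough illustration of that idea only (not the authors' implementation), the sketch below quantizes each short-time speech feature vector against a trained codebook and looks up an associated set of mouth-shape parameters for a 3-D face model; all names, dimensions, and the nearest-neighbour rule are assumptions made for illustration, and the neural-network stage described in the paper is omitted.

```python
import numpy as np

# Minimal sketch, assuming a VQ codebook trained offline (e.g. by k-means/LBG)
# on speech feature frames, with one mouth-shape parameter vector stored per
# codeword. Dimensions and names below are illustrative assumptions.

N_CODEWORDS = 64      # assumed codebook size
FEATURE_DIM = 12      # assumed per-frame feature dimension (e.g. cepstra)
PARAM_DIM = 5         # assumed number of mouth-shape parameters

# Stand-in codebook and parameter table; in practice these would be trained
# from speech frames paired with measured facial parameters.
acoustic_codebook = np.random.randn(N_CODEWORDS, FEATURE_DIM)
mouth_params_table = np.random.rand(N_CODEWORDS, PARAM_DIM)

def speech_frame_to_mouth_params(feature_vec: np.ndarray) -> np.ndarray:
    """Quantize one speech frame to its nearest codeword and return the
    mouth-shape parameters stored for that codeword."""
    distances = np.linalg.norm(acoustic_codebook - feature_vec, axis=1)
    index = int(np.argmin(distances))   # vector quantization step
    return mouth_params_table[index]    # parameters that drive the 3-D model

# Example: convert a sequence of analyzed frames into a parameter trajectory
# that could deform the face model in synchronization with the speech.
frames = np.random.randn(100, FEATURE_DIM)   # stand-in for analyzed speech
trajectory = np.vstack([speech_frame_to_mouth_params(f) for f in frames])
```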

Original language: English
Title of host publication: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Editors: Anon
Place of publication: Piscataway, NJ, United States
Publisher: IEEE
Pages: 2865-2868
Number of pages: 4
Volume: 4
ISBN (Print): 078030033
Publication status: Published - 1991
Externally published: Yes
Event: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing - ICASSP 91 - Toronto, Ont., Canada
Duration: 1991 May 14 – 1991 May 17

Other

Other: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing - ICASSP 91
City: Toronto, Ont., Canada
Period: 91/5/14 – 91/5/17

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

Morishima, S., & Harashima, H. (1991). Speech-to-image media conversion based on VQ and neural network. In Anon (Ed.), Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing (Vol. 4, pp. 2865-2868). IEEE.