Speech-to-image media conversion based on VQ and neural network

Shigeo Morishima, Hiroshi Harashima

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

4 Citations (Scopus)

Abstract

Automatic media conversion schemes from speech to facial images and the construction of a real-time image synthesis system are presented. The purpose of this research is to realize an intelligent human-machine interface or intelligent communication system using synthesized human face images. A human face image is reconstructed on a terminal display using a 3-D surface model and a texture mapping technique. Facial motion images are synthesized by transforming the 3-D model. With a motion driving method based on vector quantization and a neural network, the synthesized head image can appear to speak given words and phrases naturally, in synchronization with a speaker's voice signals.

Original language: English
Title of host publication: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Editors: Anon
Publisher: Publ by IEEE
Pages: 2865-2868
Number of pages: 4
ISBN (Print): 078030033
Publication status: Published - 1991 Dec 1
Externally published: Yes
Event: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing - ICASSP 91 - Toronto, Ont, Can
Duration: 1991 May 14 to 1991 May 17

Publication series

Name: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Volume: 4
ISSN (Print): 0736-7791

Other

Other: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing - ICASSP 91
City: Toronto, Ont, Can
Period: 91/5/14 to 91/5/17

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
