Speech-to-image media conversion based on VQ and neural network

Shigeo Morishima, Hiroshi Harashima

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Automatic media conversion schemes from speech to a facial image, together with the construction of a real-time image synthesis system, are presented. The purpose of this research is to realize an intelligent human-machine interface or an intelligent communication system with synthesized human face images. A human face image is reconstructed on the display of a terminal using a 3-D surface model and a texture mapping technique. Facial motion images are synthesized by transforming the 3-D model. In the motion-driving method, based on vector quantization and a neural network, the synthesized head image can appear to speak given words and phrases naturally, in synchronization with the voice signal from a speaker.
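
The abstract describes a pipeline in which each analyzed speech frame is vector-quantized and a neural network maps the resulting code to facial-motion (mouth-shape) parameters that drive the texture-mapped 3-D surface model. The paper's own implementation is not reproduced here; the following minimal Python/NumPy sketch only illustrates that frame-by-frame mapping, and every size in it (12-dimensional spectral features, a 64-entry codebook, one hidden layer, three mouth parameters) is an assumed placeholder rather than a value from the paper.

import numpy as np

rng = np.random.default_rng(0)

# Vector-quantization stage (all sizes are assumptions, not from the paper).
FEATURE_DIM = 12       # spectral features per speech frame, e.g. LPC cepstrum
CODEBOOK_SIZE = 64     # number of VQ codewords
codebook = rng.normal(size=(CODEBOOK_SIZE, FEATURE_DIM))  # would be trained offline

def vq_encode(frame_features):
    """Return the index of the codeword nearest to one speech frame."""
    distances = np.linalg.norm(codebook - frame_features, axis=1)
    return int(np.argmin(distances))

# Neural-network stage: one-hot VQ code -> mouth-shape parameters.
HIDDEN_DIM = 16
MOUTH_PARAMS = 3       # e.g. jaw opening, lip width, lip protrusion (hypothetical)
W1 = rng.normal(scale=0.1, size=(CODEBOOK_SIZE, HIDDEN_DIM))
b1 = np.zeros(HIDDEN_DIM)
W2 = rng.normal(scale=0.1, size=(HIDDEN_DIM, MOUTH_PARAMS))
b2 = np.zeros(MOUTH_PARAMS)

def mouth_parameters(code_index):
    """Feed a one-hot VQ code through a small feed-forward network."""
    one_hot = np.zeros(CODEBOOK_SIZE)
    one_hot[code_index] = 1.0
    hidden = np.tanh(one_hot @ W1 + b1)
    return hidden @ W2 + b2   # values that would deform the 3-D face model

# Per-frame use: analyze speech, quantize, predict a mouth shape.
frame = rng.normal(size=FEATURE_DIM)   # stand-in for one analyzed speech frame
print(mouth_parameters(vq_encode(frame)))

In such a system the codebook and the network weights would be learned from speech paired with observed mouth shapes, and the predicted parameters would deform the 3-D surface model frame by frame, in synchronization with the incoming voice signal.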

Original language: English
Title of host publication: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Editors: Anon
Place of Publication: Piscataway, NJ, United States
Publisher: Publ by IEEE
Pages: 2865-2868
Number of pages: 4
Volume: 4
ISBN (Print): 078030033
Publication status: Published - 1991
Externally published: Yes
Event: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing - ICASSP 91 - Toronto, Ont, Can
Duration: 1991 May 14 - 1991 May 17

Fingerprint

Neural networks
Vector quantization
Communication systems
Synchronization
Textures
Display devices
Telecommunication
Synthesis

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

Morishima, S., & Harashima, H. (1991). Speech-to-image media conversion based on VQ and neural network. In Anon (Ed.), Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing (Vol. 4, pp. 2865-2868). Piscataway, NJ, United States: Publ by IEEE.

@inproceedings{3cc969af737147c8b9d3c1cca114ef92,
title = "Speech-to-image media conversion based on VQ and neural network",
abstract = "Automatic media conversion schemes from speech to a facial image and a construction of a real-time image synthesis system are presented. The purpose of this research is to realize an intelligent human-machine interface or intelligent communication system with synthesized human face images. A human face image is reconstructed on the display of a terminal using a 3-D surface model and texture mapping technique. Facial motion images are synthesized by transformation of the 3-D model. In the motion driving method, based on vector quantization and the neural network, the synthesized head image can appear to speak some given words and phrases naturally, in synchronization with voice signals from a speaker.",
author = "Shigeo Morishima and Hiroshi Harashima",
year = "1991",
language = "English",
isbn = "078030033",
volume = "4",
pages = "2865--2868",
editor = "Anon",
booktitle = "Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing",
publisher = "Publ by IEEE",

}

Scopus

  • Record: http://www.scopus.com/inward/record.url?scp=0026396368&partnerID=8YFLogxK
  • Cited by: http://www.scopus.com/inward/citedby.url?scp=0026396368&partnerID=8YFLogxK