A media conversion from speech to facial image for intelligent man-machine interface

Shigeo Morishima, Hiroshi Harashima

研究成果: Article

65 引用 (Scopus)

抄録

An automatic field motion image synthesis scheme (driven by speech) and a real-time image synthesis design are presented. The purpose of this research is to realize an intelligent human-machine interface or intelligent communication system with talking head images. A human face is reconstructed on the display of a terminal using a 3-D surface model and texture mapping technique. Facial motion images are synthesized naturally by transformation of the lattice points on 3-D wire frames. Two driving motion methods, a text-to-image conversion scheme and a voice-to-image conversion scheme, are proposed. In the first method, the synthesized head image can appear to speak some given words and phrases naturally. In the second case, some mouth and jaw motions can be synthesized in synchronization with voice signals from a speaker. Facial expressions other than mouth shape and jaw position can be added at any moment, so it is easy to make the facial model appear angry, to smile, to appear sad, etc., by special modification rules. These schemes were implemented on a parallel image computer system. A real-time image synthesizer was able to generate facial motion images on the display at a TV image video rate.

元の言語English
ページ(範囲)594-600
ページ数7
ジャーナルIEEE Journal on Selected Areas in Communications
9
発行部数4
DOI
出版物ステータスPublished - 1991 5
外部発表Yes

Fingerprint

Display devices
Communication systems
Synchronization
Computer systems
Textures
Wire

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Electrical and Electronic Engineering

これを引用

@article{9a2853654dc84513962054da87079ebd,
title = "A media conversion from speech to facial image for intelligent man-machine interface",
abstract = "An automatic field motion image synthesis scheme (driven by speech) and a real-time image synthesis design are presented. The purpose of this research is to realize an intelligent human-machine interface or intelligent communication system with talking head images. A human face is reconstructed on the display of a terminal using a 3-D surface model and texture mapping technique. Facial motion images are synthesized naturally by transformation of the lattice points on 3-D wire frames. Two driving motion methods, a text-to-image conversion scheme and a voice-to-image conversion scheme, are proposed. In the first method, the synthesized head image can appear to speak some given words and phrases naturally. In the second case, some mouth and jaw motions can be synthesized in synchronization with voice signals from a speaker. Facial expressions other than mouth shape and jaw position can be added at any moment, so it is easy to make the facial model appear angry, to smile, to appear sad, etc., by special modification rules. These schemes were implemented on a parallel image computer system. A real-time image synthesizer was able to generate facial motion images on the display at a TV image video rate.",
author = "Shigeo Morishima and Hiroshi Harashima",
year = "1991",
month = "5",
doi = "10.1109/49.81953",
language = "English",
volume = "9",
pages = "594--600",
journal = "IEEE Journal on Selected Areas in Communications",
issn = "0733-8716",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "4",

}

TY - JOUR

T1 - A media conversion from speech to facial image for intelligent man-machine interface

AU - Morishima, Shigeo

AU - Harashima, Hiroshi

PY - 1991/5

Y1 - 1991/5

N2 - An automatic field motion image synthesis scheme (driven by speech) and a real-time image synthesis design are presented. The purpose of this research is to realize an intelligent human-machine interface or intelligent communication system with talking head images. A human face is reconstructed on the display of a terminal using a 3-D surface model and texture mapping technique. Facial motion images are synthesized naturally by transformation of the lattice points on 3-D wire frames. Two driving motion methods, a text-to-image conversion scheme and a voice-to-image conversion scheme, are proposed. In the first method, the synthesized head image can appear to speak some given words and phrases naturally. In the second case, some mouth and jaw motions can be synthesized in synchronization with voice signals from a speaker. Facial expressions other than mouth shape and jaw position can be added at any moment, so it is easy to make the facial model appear angry, to smile, to appear sad, etc., by special modification rules. These schemes were implemented on a parallel image computer system. A real-time image synthesizer was able to generate facial motion images on the display at a TV image video rate.

AB - An automatic field motion image synthesis scheme (driven by speech) and a real-time image synthesis design are presented. The purpose of this research is to realize an intelligent human-machine interface or intelligent communication system with talking head images. A human face is reconstructed on the display of a terminal using a 3-D surface model and texture mapping technique. Facial motion images are synthesized naturally by transformation of the lattice points on 3-D wire frames. Two driving motion methods, a text-to-image conversion scheme and a voice-to-image conversion scheme, are proposed. In the first method, the synthesized head image can appear to speak some given words and phrases naturally. In the second case, some mouth and jaw motions can be synthesized in synchronization with voice signals from a speaker. Facial expressions other than mouth shape and jaw position can be added at any moment, so it is easy to make the facial model appear angry, to smile, to appear sad, etc., by special modification rules. These schemes were implemented on a parallel image computer system. A real-time image synthesizer was able to generate facial motion images on the display at a TV image video rate.

UR - http://www.scopus.com/inward/record.url?scp=0026156861&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0026156861&partnerID=8YFLogxK

U2 - 10.1109/49.81953

DO - 10.1109/49.81953

M3 - Article

AN - SCOPUS:0026156861

VL - 9

SP - 594

EP - 600

JO - IEEE Journal on Selected Areas in Communications

JF - IEEE Journal on Selected Areas in Communications

SN - 0733-8716

IS - 4

ER -