Intelligent facial image coding driven by speech and phoneme

Shigeo Morishima, Kiyoharu Aizawa, Hiroshi Harashima

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

40 Citations (Scopus)

Abstract

The authors propose and compare two types of model-based facial motion coding schemes: synthesis by rules and synthesis by parameters. In synthesis by rules, facial motion images are synthesized on the basis of rules extracted by analysis of training image samples that cover all of the phonemes and their coarticulation. This system can be used as an automatic facial animation synthesizer driven by text input, or as a man-machine interface based on the facial motion image. In synthesis by parameters, facial motion images are synthesized from the code word index of the speech parameters. Experimental results indicate good performance for both systems, which can create natural facial motion images at a very low transmission rate. Details of the 3-D modeling, the synthesis algorithms, and the performance are discussed.
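The abstract describes two driving modes: a rule-based mapping from a phoneme sequence (with coarticulation) to facial motion, and a parameter-based mapping in which a vector-quantized speech frame's codeword index selects the facial parameters, so only the index needs to be transmitted. The following is a minimal sketch of how those two mappings could be wired up; it is not the authors' implementation, and all tables, dimensions, and names (PHONEME_SHAPES, synth_by_rules, synth_by_parameters) are hypothetical.

```python
# Illustrative sketch (hypothetical, not the paper's algorithm) of the two
# driving modes: rule-based synthesis from phonemes, and parameter-based
# synthesis from a vector-quantized speech-parameter codeword index.
import numpy as np

# Hypothetical per-phoneme mouth-shape parameters
# (e.g. lip opening, lip width, protrusion).
PHONEME_SHAPES = {
    "a": np.array([0.9, 0.6, 0.1]),
    "i": np.array([0.3, 0.9, 0.0]),
    "u": np.array([0.2, 0.3, 0.8]),
    "m": np.array([0.0, 0.5, 0.2]),
    "sil": np.array([0.0, 0.4, 0.1]),
}

def synth_by_rules(phonemes, frames_per_phoneme=5, blend_start=0.5):
    """Rule-based synthesis: interpolate between successive target shapes so
    each frame is also influenced by the preceding phoneme (a crude stand-in
    for coarticulation rules learned from training images)."""
    seq = ["sil"] + list(phonemes) + ["sil"]
    frames = []
    for prev, cur in zip(seq, seq[1:]):
        a, b = PHONEME_SHAPES[prev], PHONEME_SHAPES[cur]
        for t in np.linspace(blend_start, 1.0, frames_per_phoneme):
            frames.append((1 - t) * a + t * b)
    return np.stack(frames)

def synth_by_parameters(speech_frames, codebook, code_to_shape):
    """Parameter-based synthesis: quantize each speech-parameter frame to its
    nearest codeword and look up the facial parameters for that index.
    Only the codeword index would need to be sent to the receiver."""
    out = []
    for x in speech_frames:
        idx = int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
        out.append(code_to_shape[idx])
    return np.stack(out)

if __name__ == "__main__":
    # Rule-based: drive the face model from a phoneme string (text input).
    print(synth_by_rules("aimu").shape)

    # Parameter-based: drive the face model from LPC-like speech parameters.
    rng = np.random.default_rng(0)
    codebook = rng.normal(size=(8, 12))        # 8 codewords, 12-dim speech params
    code_to_shape = rng.random(size=(8, 3))    # facial params per codeword
    speech = rng.normal(size=(20, 12))         # 20 analysis frames
    print(synth_by_parameters(speech, codebook, code_to_shape).shape)
```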

Original language: English
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Editors: Anon
Publisher: Publ by IEEE
Pages: 1795-1798
Number of pages: 4
Volume: 3
Publication status: Published - 1989
Externally published: Yes
Event: 1989 International Conference on Acoustics, Speech, and Signal Processing - Glasgow, Scotland
Duration: 1989 May 23 - 1989 May 26

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

Morishima, S., Aizawa, K., & Harashima, H. (1989). Intelligent facial image coding driven by speech and phoneme. In Anon (Ed.), ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 3, pp. 1795-1798). Publ by IEEE.
