Speech recognition technology combined with three dimensional lip movement

K. Komiya, R. Ishikawa, Keiko Momose

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

To improve speech recognition efficiency under background noise, such as outdoors, we propose a recognition technology that combines speech with three-dimensional lip movement. First, three-dimensional movements at four positions of the mouth were measured and analyzed with principal component analysis to clarify which positions and which directions contribute most to pronunciation. Second, recognition tests on 50 Japanese words were carried out at noise levels ranging from 40 to 80 dB. In the experiment, recognition efficiency exceeded 80% at 70 dB, an improvement of 40% over ordinary speech recognition. These results suggest that the proposed method can be adapted into a practical speech recognition technology. Finally, topics for further research were identified, including more precise measurement of lip movement and outdoor experiments and data collection.
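The principal component analysis step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the four position names, the 12-feature layout (4 positions × 3 axes), and the synthetic data, which is built so that vertical lower-lip motion dominates. The sketch finds only the first principal component, by power iteration on the covariance matrix, and reports which position and direction carries the largest loading.

```python
import math
import random

# Hypothetical layout: 4 mouth positions x 3 axes = 12 features per frame.
# The position names are illustrative assumptions, not the paper's.
POSITIONS = ["upper_lip", "lower_lip", "left_corner", "right_corner"]
AXES = ["x", "y", "z"]
LABELS = [f"{p}.{a}" for p in POSITIONS for a in AXES]


def make_synthetic_frames(n_frames=200, seed=0):
    """Synthetic data only: lower-lip vertical motion dominates by construction."""
    rng = random.Random(seed)
    frames = []
    for t in range(n_frames):
        opening = math.sin(2 * math.pi * t / 25)  # periodic mouth opening
        row = [rng.gauss(0.0, 0.05) for _ in LABELS]  # small measurement noise
        row[LABELS.index("lower_lip.y")] += 1.0 * opening
        row[LABELS.index("upper_lip.y")] += -0.3 * opening
        frames.append(row)
    return frames


def covariance(rows):
    """Sample covariance matrix of the mean-centered feature columns."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iters=500):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return eigenvalue, v


frames = make_synthetic_frames()
cov = covariance(frames)
eigenvalue, loading = first_component(cov)

# Share of total variance captured by the first component.
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = eigenvalue / total_variance

# Feature (position and direction) with the largest absolute loading.
dominant = max(range(len(LABELS)), key=lambda i: abs(loading[i]))
print(f"PC1 explains {explained:.1%} of variance; largest loading: {LABELS[dominant]}")
```

With real measurements, the rows of `frames` would be replaced by the measured coordinates; the loadings then indicate which positions and directions vary most during pronunciation.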

Original language: English
Title of host publication: Proceedings of SPIE - The International Society for Optical Engineering
Editors: B.D. Corner, J.H. Nurre, R.P. Pargas
Pages: 95-102
Number of pages: 8
Volume: 4298
DOIs: https://doi.org/10.1117/12.424893
Publication status: Published - 2001
Externally published: Yes
Event: Three-Dimensional Image Capture and Applications IV - San Jose, CA, United States
Duration: 2001 Jan 24 → 2001 Jan 25

Keywords

  • Background noise
  • Principal component analysis
  • Recognition efficiency
  • Three dimensional lip movement

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Condensed Matter Physics

Cite this

Komiya, K., Ishikawa, R., & Momose, K. (2001). Speech recognition technology combined with three dimensional lip movement. In B. D. Corner, J. H. Nurre, & R. P. Pargas (Eds.), Proceedings of SPIE - The International Society for Optical Engineering (Vol. 4298, pp. 95-102) https://doi.org/10.1117/12.424893

@inproceedings{3cb3e81dd4004b7b8976cf3f039416dc,
title = "Speech recognition technology combined with three dimensional lip movement",
abstract = "In order to improve speech recognition efficiency under background noise such as out of doors, we propose a new recognition technology combined with three dimensional lip movements. In this paper firstly, three-dimensional movements at four positions of the mouth were measured using principal component analysis to clarify which positions and which directions are main contributors to pronunciation. Secondly, recognition evaluation tests for 50 Japanese words were carried out under noise levels ranging from 40 to 80 dB. In the experiment, over 80{\%} of recognition efficiency was measured at 70dB and improvement of 40{\%} was obtained compared with ordinary speech recognition. From the experimental results, the proposed method can be modified to be used as practical speech recognition technology. Finally, research subjects were picked up such as an improvement in precision of measuring lip movement and experiments and data collection out of doors.",
keywords = "Background noise, Principal component analysis, Recognition efficiency, Three dimensional lip movement",
author = "K. Komiya and R. Ishikawa and Keiko Momose",
year = "2001",
doi = "10.1117/12.424893",
language = "English",
volume = "4298",
pages = "95--102",
editor = "B.D. Corner and J.H. Nurre and R.P. Pargas",
booktitle = "Proceedings of SPIE - The International Society for Optical Engineering",

}

TY  - GEN
T1  - Speech recognition technology combined with three dimensional lip movement
AU  - Komiya, K.
AU  - Ishikawa, R.
AU  - Momose, Keiko
PY  - 2001
Y1  - 2001
AB  - In order to improve speech recognition efficiency under background noise such as out of doors, we propose a new recognition technology combined with three dimensional lip movements. In this paper firstly, three-dimensional movements at four positions of the mouth were measured using principal component analysis to clarify which positions and which directions are main contributors to pronunciation. Secondly, recognition evaluation tests for 50 Japanese words were carried out under noise levels ranging from 40 to 80 dB. In the experiment, over 80% of recognition efficiency was measured at 70dB and improvement of 40% was obtained compared with ordinary speech recognition. From the experimental results, the proposed method can be modified to be used as practical speech recognition technology. Finally, research subjects were picked up such as an improvement in precision of measuring lip movement and experiments and data collection out of doors.
KW  - Background noise
KW  - Principal component analysis
KW  - Recognition efficiency
KW  - Three dimensional lip movement
UR  - http://www.scopus.com/inward/record.url?scp=0034935821&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=0034935821&partnerID=8YFLogxK
U2  - 10.1117/12.424893
DO  - 10.1117/12.424893
M3  - Conference contribution
VL  - 4298
SP  - 95
EP  - 102
BT  - Proceedings of SPIE - The International Society for Optical Engineering
A2  - Corner, B.D.
A2  - Nurre, J.H.
A2  - Pargas, R.P.
ER  -