Spectral Mapping onto Probabilistic Domain Using Neural Networks and Its Application to Speaker Adaptive Phoneme Recognition

T. Kobayashi, K. Shirai

Research output: Contribution to conference › Paper › peer-review

Abstract

A feature parameter space called PRPG (Probability Ratios between Phoneme Group pairs) is used for speaker-adaptive phoneme recognition. The coordinate conversion is performed by neural networks. Each output node of the network represents the a posteriori probability of a phoneme group. Therefore, distance in the PRPG coordinate system corresponds directly to a difference in likelihood, and a region that carries the same information for speech recognition is compressed into a single point. Moreover, by the definition of the coordinate system, the meanings of the axes are equivalent across speakers, so speaker adaptation can be performed easily without trajectory mapping. The experimental results show that the scores of speaker-adaptive recognition in the PRPG domain are always superior to those of speaker-dependent recognition in the spectral domain.
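
The abstract does not give the exact PRPG formula, so the following is only a minimal sketch under stated assumptions: each coordinate is taken to be the log ratio of the posteriors of one pair of phoneme groups, and the posteriors themselves are made-up numbers standing in for network outputs. The function names `prpg_coordinates` and `euclidean` are hypothetical and do not come from the paper.

```python
import math
from itertools import combinations


def prpg_coordinates(posteriors, eps=1e-12):
    """Map phoneme-group posteriors to pairwise log probability ratios.

    ASSUMPTION: one coordinate per unordered pair (i, j), defined as
    log(p_i / p_j). The paper's actual definition may differ.
    """
    pairs = combinations(range(len(posteriors)), 2)
    return [
        math.log(posteriors[i] + eps) - math.log(posteriors[j] + eps)
        for i, j in pairs
    ]


def euclidean(a, b):
    """Plain Euclidean distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical posteriors over three phoneme groups. Speakers A and B are
# assumed to produce different spectra that their networks map to the same
# posteriors; frame C belongs to a different phoneme group.
speaker_a = [0.70, 0.20, 0.10]
speaker_b = [0.70, 0.20, 0.10]
frame_c = [0.10, 0.20, 0.70]

coords_a = prpg_coordinates(speaker_a)
coords_b = prpg_coordinates(speaker_b)
coords_c = prpg_coordinates(frame_c)

# Frames with identical posteriors collapse to one point in this space,
# while frames with different posteriors are separated.
print(euclidean(coords_a, coords_b))
print(euclidean(coords_a, coords_c))
```

Under this assumed definition the axes depend only on the phoneme-group posteriors, not on the speaker's spectra, which is one way to read the abstract's claim that the axes have the same meaning for every speaker.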

Original language: English
Pages: 385-388
Number of pages: 4
Publication status: Published - 1992
Event: 2nd International Conference on Spoken Language Processing, ICSLP 1992 - Banff, Canada
Duration: 1992 Oct 13 to 1992 Oct 16

Conference

Conference: 2nd International Conference on Spoken Language Processing, ICSLP 1992
Country/Territory: Canada
City: Banff
Period: 92/10/13 to 92/10/16

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
