Speaker clustering for speech recognition using the parameters characterizing vocal-tract dimensions

Masaki Naito, Li Deng, Yoshinori Sagisaka

Research output: Conference contribution

3 Citations (Scopus)

Abstract

We propose speaker clustering methods based on articulatory parameters related to the vocal-tract size of individual speakers. Two parameters characterizing gross vocal-tract dimensions are first derived from the formants of speaker-specific Japanese vowels and are then used to cluster a total of 148 male Japanese speakers. The resulting speaker clusters are found to differ significantly from those obtained with conventional acoustic criteria. Japanese phoneme recognition experiments are carried out using speaker-clustered tied-state HMMs (HMNets) trained for each cluster. Compared with the baseline gender-dependent model, the clustering method using vocal-tract parameters reduces recognition error by 5.7%.
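The abstract does not say how the two parameters are computed or which clustering algorithm is used, so the following is only an illustrative sketch under stated assumptions. It uses the textbook uniform-tube approximation (a tube closed at the glottis and open at the lips, with resonances F_n = (2n-1)c/(4L)) to turn formants into a rough length estimate, and plain k-means to group speakers by two size-related features. The function names `tube_length_cm` and `kmeans`, the speed-of-sound constant, and the choice of k-means are all hypothetical stand-ins, not the authors' method.

```python
import math

# Assumed speed of sound in the vocal tract (cm/s); a common textbook value.
SPEED_OF_SOUND_CM_S = 35000.0


def tube_length_cm(formants_hz):
    """Rough tract length from formants, assuming a uniform tube.

    Each formant F_n gives one estimate L = (2n - 1) * c / (4 * F_n);
    the estimates are averaged. This is NOT the paper's derivation.
    """
    estimates = [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means over tuples of floats.

    Initial centroids are simply the first k points, which keeps the
    sketch deterministic; real use would want a better initialization.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: mean of each cluster (empty clusters keep their centroid).
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical speakers, each described by two size-related parameters (cm).
speakers = [(8.0, 8.5), (9.5, 9.0), (8.2, 8.4), (9.4, 9.1)]
labels, centroids = kmeans(speakers, k=2)
```

In a speaker-clustered recognizer, each resulting label would select the training subset for one cluster-specific acoustic model; the feature values above are made up for illustration only.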

Original language: English
Title of host publication: Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998
Pages: 981-984
Number of pages: 4
Publication status: Published - 1 Dec 1998
Externally published: Yes
Event: 1998 23rd IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998 - Seattle, WA, United States
Duration: 12 May 1998 to 15 May 1998

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2
ISSN (Print): 1520-6149

Conference

Conference: 1998 23rd IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998
Country/Territory: United States
City: Seattle, WA
Period: 12 May 1998 to 15 May 1998

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
