Speech coding based on a multi-layer neural network

Shigeo Morishima*, Hiroshi Harashima, Yasuo Katayama

*Corresponding author for this work

Research output: peer-reviewed

3 Citations (Scopus)

Abstract

The authors present a speech-compression scheme based on a three-layer perceptron in which the number of units in the hidden layer is reduced. Input and output layers have the same number of units in order to achieve identity mapping. Speech coding is realized by scalar or vector quantization of hidden-layer outputs. By analyzing the weighting coefficients, it can be shown that speech coding based on a three-layer neural network is speaker-independent. Transform coding is automatically based on back propagation. The relation between compression ratio and SNR (signal-to-noise ratio) is investigated. The bit allocation and optimum number of hidden-layer units necessary to realize a specific bit rate are given. According to the analysis of weighting coefficients, speech coding based on a neural network is transform coding similar to Karhunen-Loeve transformation. The characteristics of a five-layer neural network are examined. It is shown that since the five-layer neural network can realize nonlinear mapping, it is more effective than the three-layer network.
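The scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the synthetic AR(2) signal standing in for speech, the tanh hidden units, the linear output layer, the uniform scalar quantizer, the frame length of 16 and the 4 hidden units are all assumptions chosen for the example, and every function name is hypothetical.

```python
import numpy as np


def make_frames(n_frames=2000, frame_len=16, seed=0):
    """Synthetic stand-in for speech (assumption): AR(2) noise cut into frames."""
    rng = np.random.default_rng(seed)
    n = n_frames * frame_len
    e = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(2, n):
        x[t] = 1.6 * x[t - 1] - 0.8 * x[t - 2] + e[t]
    x /= x.std()  # normalize to unit variance
    return x.reshape(n_frames, frame_len)


def train_autoencoder(X, n_hidden=4, epochs=2000, lr=0.01, seed=0):
    """Three-layer network trained by back propagation to reproduce its input."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W1 = 0.1 * rng.standard_normal((n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = 0.1 * rng.standard_normal((n_hidden, n_in))
    b2 = np.zeros(n_in)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # hidden layer (fewer units than input)
        err = (H @ W2 + b2) - X           # identity-mapping error
        dH = (err @ W2.T) * (1.0 - H ** 2)  # back-propagated through tanh
        W2 -= lr * (H.T @ err) / len(X)
        b2 -= lr * err.mean(axis=0)
        W1 -= lr * (X.T @ dH) / len(X)
        b1 -= lr * dH.mean(axis=0)
    return W1, b1, W2, b2


def encode(X, W1, b1):
    return np.tanh(X @ W1 + b1)


def decode(H, W2, b2):
    return H @ W2 + b2


def quantize(H, bits):
    """Uniform scalar quantizer on [-1, 1]; returns indices and midpoint values."""
    levels = 2 ** bits
    idx = np.clip(np.floor((H + 1.0) / 2.0 * levels), 0, levels - 1).astype(int)
    return idx, (idx + 0.5) / levels * 2.0 - 1.0


def snr_db(x, y):
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum((x - y) ** 2))


if __name__ == "__main__":
    X = make_frames()
    W1, b1, W2, b2 = train_autoencoder(X)
    H = encode(X, W1, b1)
    print("unquantized SNR (dB):", snr_db(X, decode(H, W2, b2)))
    for bits in (2, 4, 8):
        _, Hq = quantize(H, bits)
        rate = H.shape[1] * bits / X.shape[1]  # bits per input sample
        print(bits, "bits/unit ->", rate, "bits/sample, SNR (dB):",
              snr_db(X, decode(Hq, W2, b2)))
```

Only the quantizer indices would need to be transmitted, so the bit rate in this sketch is the number of hidden units times the bits per unit, divided by the frame length. The SNR here is measured on the training frames themselves, which is enough to show the rate versus SNR trade-off but says nothing about speaker independence.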

Original language: English
Pages (from-to): 429-433
Number of pages: 5
Journal: Conference Record - International Conference on Communications
Volume: 2
Publication status: Published - 1 Dec 1990
Externally published: Yes
Event: IEEE International Conference on Communications - ICC '90 Part 2 (of 4) - Atlanta, GA, USA
Duration: 16 Apr 1990 to 19 Apr 1990

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Electrical and Electronic Engineering
