Very low bit rate speech coding based on a phoneme recognition

Shigeo Morishima, Hiroshi Harashima

Research output: Contribution to conference (Paper)

Abstract

Summary form only given, as follows. A new speech compression technique for voice storage or voice mail is presented. The coding scheme of the system is basically stochastic coding (CELP), but the results of phoneme recognition and segmentation are used as the basis for vector quantization (VQ) codebook selection and voiced-unvoiced control. Recognition relies on heuristic knowledge to assign each segment to one of nine phonemes. For each phoneme, codebooks for both the PARCOR coefficients and the excitations are trained on a sequence of 75 spoken words that includes all the vowel-consonant-vowel (VCV) patterns. The phoneme code number is quantized at the beginning of each segment to select the optimum codebooks and strategies for that segment, so the scheme can be categorized as multiple-stage VQ. As a result, each codebook is very small and each segment is very long. Very-low-bit-rate coding with high quality can be realized, and a special procedure can be performed to increase intelligibility. At an average bit rate of 860 b/s, the experimental results show an average segmental SNR of 6.30 dB, and a subjective test indicates good intelligibility and phoneme clarity.
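The idea of sending a phoneme code once per segment and then quantizing each frame against a small, phoneme-specific codebook can be illustrated with a minimal sketch. This is not the authors' implementation: the two toy codebooks, the two-dimensional vectors, and the names `CODEBOOKS`, `encode_segment`, `decode_segment`, and `segmental_snr_db` are all hypothetical, and the segmental SNR is computed with a common definition (the mean of per-frame SNRs in dB) that the abstract does not spell out.

```python
import math

# Hypothetical per-phoneme codebooks: phoneme id -> list of code vectors.
# The values are invented for illustration; the paper trains its codebooks
# on a 75-word speech sequence.
CODEBOOKS = {
    0: [[0.9, -0.2], [0.7, 0.1]],
    1: [[-0.5, 0.4], [-0.3, 0.6]],
}


def nearest_index(codebook, vector):
    """Return the index of the code vector with the smallest squared error."""
    def sq_err(code):
        return sum((c - v) ** 2 for c, v in zip(code, vector))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))


def encode_segment(phoneme_id, frames):
    """Encode a segment as (phoneme id, one codebook index per frame)."""
    codebook = CODEBOOKS[phoneme_id]  # first stage: phoneme selects the codebook
    return phoneme_id, [nearest_index(codebook, f) for f in frames]


def decode_segment(phoneme_id, indices):
    """Look the indices up in the codebook selected by the phoneme id."""
    codebook = CODEBOOKS[phoneme_id]
    return [codebook[i] for i in indices]


def segmental_snr_db(reference, decoded, frame_len):
    """Mean of per-frame SNRs in dB (assumed definition, see lead-in)."""
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(x * x for x in ref)
        noise = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal > 0 and noise > 0:  # skip silent or perfectly coded frames
            snrs.append(10 * math.log10(signal / noise))
    return sum(snrs) / len(snrs) if snrs else float("nan")


phoneme, indices = encode_segment(0, [[0.8, -0.1], [0.6, 0.0]])
restored = decode_segment(phoneme, indices)
```

Because the phoneme id is sent only once per segment and each per-phoneme codebook is small, each frame index needs only a few bits, which is the property the abstract credits for the low bit rate.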

Original language: English
Pages: 71-72
Number of pages: 2
Publication status: Published - 1988 Dec 1

ASJC Scopus subject areas

  • Engineering (all)
