Summary form only given, as follows. A new speech compression technique for voice storage or voice mail is presented. The coding scheme is basically stochastic coding (CELP), but the results of phoneme recognition and segmentation are used as the basis for vector quantization (VQ) codebook selection and voiced-unvoiced control. The recognition process uses heuristic knowledge to decide among nine phonemes. Codebooks for both the PARCOR coefficients and the excitations of each phoneme are trained on a spoken sequence of 75 words that includes all the VCV patterns. The phoneme code number is quantized at the beginning of each segment to select the optimum codebooks and strategies for that segment. This scheme can be categorized as multiple-stage VQ. As a result, each codebook is very small and each segment is very long. Very-low-bit-rate coding with high quality can be realized, and a special procedure can be performed to increase intelligibility. At an average bit rate of 860 b/s, the experimental results show an average segmental SNR of 6.30 dB, and a subjective test indicates good intelligibility and phoneme clarity.
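The codebook-selection step described above can be illustrated with a minimal sketch. It assumes that each segment carries one phoneme code followed by one codebook index per frame, and it uses a plain nearest-neighbour squared-error search rather than the analysis-by-synthesis search of a real CELP coder. The function names, the two-dimensional vectors, and the toy codebooks are all hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def nearest_index(codebook: List[Vector], target: Vector) -> int:
    """Return the index of the codeword with the smallest squared error."""
    def sq_err(codeword: Vector) -> float:
        return sum((c - t) ** 2 for c, t in zip(codeword, target))

    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))


def encode_segment(
    phoneme_id: int,
    frames: List[Vector],
    codebooks: Dict[int, List[Vector]],
) -> Tuple[int, List[int]]:
    """Encode one segment: the phoneme code first, then one index per frame.

    The phoneme code selects which (small) codebook is searched, so the
    per-frame indices only need to address that codebook.
    """
    codebook = codebooks[phoneme_id]
    return phoneme_id, [nearest_index(codebook, frame) for frame in frames]


def decode_segment(
    phoneme_id: int,
    indices: List[int],
    codebooks: Dict[int, List[Vector]],
) -> List[Vector]:
    """Look the indices up in the codebook chosen by the phoneme code."""
    codebook = codebooks[phoneme_id]
    return [codebook[i] for i in indices]


# Toy, hypothetical codebooks: one small codebook per phoneme class.
codebooks = {
    0: [(0.0, 0.0), (1.0, 1.0)],
    1: [(5.0, 5.0), (6.0, 4.0)],
}

# Two hypothetical parameter vectors belonging to a segment of phoneme 0.
frames = [(0.9, 1.2), (0.1, -0.2)]

code = encode_segment(0, frames, codebooks)
decoded = decode_segment(code[0], code[1], codebooks)
```

In this sketch the phoneme code is sent once per segment, so a long segment spreads that cost over many frames while each frame index stays short because the selected codebook is small.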
|Publication status||Published - Dec 1 1988|