A word recognition system that uses a network representation of acoustic-phonetic segments is proposed and tested on a speaker-independent isolated word recognition task. The system has four major features: (1) context-dependent segment modeling, (2) rule-based generation of networks of acoustic-phonetic segments, (3) direct matching of input speech against the networks, and (4) the use of mel-cepstra and a matrix representation of segments. Average error rates for the 10 male speakers were 0.6% for the 53 city names and 2.3% for the 220 words. These are 1/5 and 1/3, respectively, of the error rates obtained with the whole-word multiple-template method. These results show the effectiveness of the proposed method.
|Journal||Denshi Gijutsu Sogo Kenkyusho Iho/Bulletin of the Electrotechnical Laboratory|
|Publication status||Published - 1990|