Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition

Kazushi Ishihara, Yuya Hattori, Tomohiro Nakatani, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Onomatopoeia, or sound-imitation words (SIWs), are important for conveying sound events in human-computer communication. One problem in recognizing environmental sounds by means of SIWs is listener dependency: different listeners hear the same environmental sound as different SIWs, even under the same conditions. Ordinary Japanese phonemes are therefore inadequate for expressing SIWs. To cope with this ambiguity in phoneme determination, we designed a new set of phonemes, referred to as the basic phoneme-groups, to represent environmental sounds. Each basic phoneme-group consists of one or more Japanese phonemes, so the ambiguity is resolved by generating one or more SIWs for a sound event. A hidden Markov model (HMM)-based scheme is adopted to recognize SIWs using the phoneme-groups. Listening experiments with seven subjects showed that automatic SIW recognition based on the basic phoneme-groups outperformed recognition based on the other types of phonemes. The recall and precision rates were 56.4% and 72.2%, respectively.
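
As a rough illustration of the idea in the abstract, the sketch below expands a sequence of phoneme-groups into every SIW it can stand for, then scores the candidates against the SIWs that listeners reported. The group labels, their member phonemes, the function names, and the set-based precision and recall definitions are all assumptions made for this example; the abstract does not specify the actual phoneme-groups or the exact evaluation procedure, and the HMM recognition step is not modeled here.

```python
from itertools import product

# Hypothetical phoneme-groups: each label maps to the Japanese phonemes it
# may stand for. These groupings are illustrative only and are NOT the
# basic phoneme-groups defined in the paper.
PHONEME_GROUPS = {
    "k-t": ["k", "t"],  # an ambiguous group covering two consonants
    "a": ["a"],         # an unambiguous group with a single phoneme
    "N": ["N"],         # moraic nasal, also unambiguous
}


def expand_siws(group_sequence, groups=PHONEME_GROUPS):
    """Enumerate every SIW obtainable from a sequence of phoneme-group labels."""
    choices = [groups[label] for label in group_sequence]
    return ["".join(phonemes) for phonemes in product(*choices)]


def precision_recall(generated, listener_answers):
    """Set-based precision and recall (an assumed definition for this sketch)."""
    generated = set(generated)
    listener_answers = set(listener_answers)
    hits = generated & listener_answers
    precision = len(hits) / len(generated) if generated else 0.0
    recall = len(hits) / len(listener_answers) if listener_answers else 0.0
    return precision, recall


# One recognized group sequence yields two candidate SIWs.
candidates = expand_siws(["k-t", "a", "N"])

# Made-up listener responses for the same sound event.
precision, recall = precision_recall(candidates, ["kaN", "paN", "taN"])
print(candidates, precision, recall)
```

In this toy case both generated candidates match a listener response, while one listener response ("paN") is not generated, which is the kind of trade-off between precision and recall that the reported figures summarize.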

Original language: English
Title of host publication: 8th International Conference on Spoken Language Processing, ICSLP 2004
Publisher: International Speech Communication Association
Pages: 1485-1488
Number of pages: 4
Publication status: Published - 2004
Externally published: Yes
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: 2004 Oct 4 to 2004 Oct 8

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Ishihara, K., Hattori, Y., Nakatani, T., Komatani, K., Ogata, T., & Okuno, H. G. (2004). Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition. In 8th International Conference on Spoken Language Processing, ICSLP 2004 (pp. 1485-1488). International Speech Communication Association.