Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition

Kazushi Ishihara, Yuya Hattori, Tomohiro Nakatani, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

Onomatopoeia, or sound-imitation words (SIWs), are important for conveying sound events in human-computer communication. One problem is listener dependency in recognizing environmental sounds by means of SIWs: different listeners hear the same environmental sound as different SIWs even under the same conditions. Therefore, the usual Japanese phonemes are not adequate to express SIWs. To cope with this ambiguity in phoneme determination, we designed a set of new phonemes, referred to as basic phoneme-groups, to represent environmental sounds. Each basic phoneme-group consists of one or more Japanese phonemes, so the ambiguity is resolved by generating one or more SIWs for a sound event. An HMM-based scheme is adopted to recognize SIWs using the phoneme-groups. Listening experiments with seven subjects showed that automatic SIW recognition based on the basic phoneme-groups outperformed recognition based on other types of phonemes. The recall and precision rates were 56.4% and 72.2%, respectively.
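
The idea that one phoneme-group can stand for several Japanese phonemes, and therefore yield several candidate SIWs for one sound event, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the group names and their members (`PHONEME_GROUPS`), the helper names, and the set-overlap definition of recall and precision are hypothetical and are not taken from the paper, which may define its groups and its evaluation differently. The HMM-based recognition step is not modeled here.

```python
from itertools import product

# Hypothetical phoneme-groups (illustrative only, not the paper's inventory).
# Each group lists the Japanese phonemes a listener might perceive for it.
PHONEME_GROUPS = {
    "k-t": ["k", "t"],  # an ambiguous consonant group
    "a": ["a"],         # a group with a single member
    "N": ["N"],         # moraic nasal
}


def expand_siws(group_sequence, groups=PHONEME_GROUPS):
    """Return every SIW obtained by picking one phoneme per phoneme-group."""
    choices = [groups[g] for g in group_sequence]
    return ["".join(phonemes) for phonemes in product(*choices)]


def recall_precision(candidates, listener_answers):
    """Set-overlap recall and precision (an assumed definition)."""
    cand, ans = set(candidates), set(listener_answers)
    hits = len(cand & ans)
    recall = hits / len(ans) if ans else 0.0
    precision = hits / len(cand) if cand else 0.0
    return recall, precision


# One recognized group sequence expands to two candidate SIWs.
candidates = expand_siws(["k-t", "a", "N"])
print(candidates)  # ['kaN', 'taN']

# Made-up listener answers, used only to exercise the scoring function.
print(recall_precision(candidates, ["kaN", "paN"]))  # (0.5, 0.5)
```

In this toy setting, a recognizer that outputs the group sequence `k-t a N` covers both listeners who hear "kaN" and listeners who hear "taN", which is the sense in which generating more than one SIW per sound event addresses listener dependency.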

Original language: English
Title of host publication: 8th International Conference on Spoken Language Processing, ICSLP 2004
Publisher: International Speech Communication Association
Pages: 1485-1488
Number of pages: 4
Publication status: Published - 2004
Externally published: Yes
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: 2004 Oct 4 to 2004 Oct 8

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Ishihara, K., Hattori, Y., Nakatani, T., Komatani, K., Ogata, T., & Okuno, H. G. (2004). Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition. In 8th International Conference on Spoken Language Processing, ICSLP 2004 (pp. 1485-1488). International Speech Communication Association.

@inproceedings{aeeaf20d529347d5b3ff8ee6204012d5,
title = "Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition",
abstract = "Onomatopoeia, or sound-imitation words (SIWs), are important for conveying sound events in human-computer communication. One problem is listener dependency in recognizing environmental sounds by means of SIWs: different listeners hear the same environmental sound as different SIWs even under the same conditions. Therefore, the usual Japanese phonemes are not adequate to express SIWs. To cope with this ambiguity in phoneme determination, we designed a set of new phonemes, referred to as basic phoneme-groups, to represent environmental sounds. Each basic phoneme-group consists of one or more Japanese phonemes, so the ambiguity is resolved by generating one or more SIWs for a sound event. An HMM-based scheme is adopted to recognize SIWs using the phoneme-groups. Listening experiments with seven subjects showed that automatic SIW recognition based on the basic phoneme-groups outperformed recognition based on other types of phonemes. The recall and precision rates were 56.4{\%} and 72.2{\%}, respectively.",
author = "Kazushi Ishihara and Yuya Hattori and Tomohiro Nakatani and Kazunori Komatani and Tetsuya Ogata and Okuno, {Hiroshi G.}",
year = "2004",
language = "English",
pages = "1485--1488",
booktitle = "8th International Conference on Spoken Language Processing, ICSLP 2004",
publisher = "International Speech Communication Association",

}

TY  - GEN
T1  - Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition
AU  - Ishihara, Kazushi
AU  - Hattori, Yuya
AU  - Nakatani, Tomohiro
AU  - Komatani, Kazunori
AU  - Ogata, Tetsuya
AU  - Okuno, Hiroshi G.
PY  - 2004
Y1  - 2004
AB  - Onomatopoeia, or sound-imitation words (SIWs), are important for conveying sound events in human-computer communication. One problem is listener dependency in recognizing environmental sounds by means of SIWs: different listeners hear the same environmental sound as different SIWs even under the same conditions. Therefore, the usual Japanese phonemes are not adequate to express SIWs. To cope with this ambiguity in phoneme determination, we designed a set of new phonemes, referred to as basic phoneme-groups, to represent environmental sounds. Each basic phoneme-group consists of one or more Japanese phonemes, so the ambiguity is resolved by generating one or more SIWs for a sound event. An HMM-based scheme is adopted to recognize SIWs using the phoneme-groups. Listening experiments with seven subjects showed that automatic SIW recognition based on the basic phoneme-groups outperformed recognition based on other types of phonemes. The recall and precision rates were 56.4% and 72.2%, respectively.
UR  - http://www.scopus.com/inward/record.url?scp=85009062543&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85009062543&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:85009062543
SP  - 1485
EP  - 1488
BT  - 8th International Conference on Spoken Language Processing, ICSLP 2004
PB  - International Speech Communication Association
ER  -