Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes

Kazushi Ishihara, Tomohiro Nakatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

Sound-imitation words (SIWs), or onomatopoeia, are important for human-computer interaction and the automatic tagging of sound archives. The main problem in automatic SIW recognition is ambiguity in determining phonemes, since different listeners hear the same environmental sound as different SIWs even in the same situation. To solve this problem, we designed a set of new phonemes, called the basic phoneme-group set, to represent environmental sounds, in addition to a set of articulation-based phoneme-groups. Automatic SIW recognition based on hidden Markov models (HMMs) with the basic phoneme-groups is allowed to generate multiple SIWs in order to absorb ambiguities caused by listener and situation dependency. Listening experiments with seven subjects showed that automatic SIW recognition based on the basic phoneme-groups outperformed recognition based on the articulation-based phoneme-groups and recognition based on Japanese phonemes. The proposed system thus proved better suited for use in human-computer interaction.

Original language: English
Title of host publication: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)
Editors: C. Zhang, H.W. Guesgen, W.K. Yeap
Pages: 909-918
Number of pages: 10
Volume: 3157
Publication status: Published - 2004
Externally published: Yes
Event: 8th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2004: Trends in Artificial Intelligence - Auckland, New Zealand
Duration: 2004 Aug 9 → 2004 Aug 13

Other

Other: 8th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2004: Trends in Artificial Intelligence
Country: New Zealand
City: Auckland
Period: 04/8/9 → 04/8/13

Fingerprint

Acoustic waves
Hidden Markov models
Human computer interaction
Experiments

ASJC Scopus subject areas

  • Hardware and Architecture

Cite this

Ishihara, K., Nakatani, T., Ogata, T., & Okuno, H. G. (2004). Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes. In C. Zhang, H. W. Guesgen, & W. K. Yeap (Eds.), Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3157, pp. 909-918)

Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes. / Ishihara, Kazushi; Nakatani, Tomohiro; Ogata, Tetsuya; Okuno, Hiroshi G.

Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). ed. / C. Zhang; H.W. Guesgen; W.K. Yeap. Vol. 3157 2004. p. 909-918.

Ishihara, K, Nakatani, T, Ogata, T & Okuno, HG 2004, Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes. in C Zhang, HW Guesgen & WK Yeap (eds), Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). vol. 3157, pp. 909-918, 8th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2004: Trends in Artificial Intelligence, Auckland, New Zealand, 04/8/9.
Ishihara K, Nakatani T, Ogata T, Okuno HG. Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes. In Zhang C, Guesgen HW, Yeap WK, editors, Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). Vol. 3157. 2004. p. 909-918
Ishihara, Kazushi ; Nakatani, Tomohiro ; Ogata, Tetsuya ; Okuno, Hiroshi G. / Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes. Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). editor / C. Zhang ; H.W. Guesgen ; W.K. Yeap. Vol. 3157 2004. pp. 909-918
@inproceedings{1db6c9e34a9b4d0aaa3879b5e13b56e5,
title = "Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes",
abstract = "Sound-imitation words (SIWs), or onomatopoeia, are important for human-computer interaction and the automatic tagging of sound archives. The main problem in automatic SIW recognition is ambiguity in determining phonemes, since different listeners hear the same environmental sound as different SIWs even in the same situation. To solve this problem, we designed a set of new phonemes, called the basic phoneme-group set, to represent environmental sounds, in addition to a set of articulation-based phoneme-groups. Automatic SIW recognition based on hidden Markov models (HMMs) with the basic phoneme-groups is allowed to generate multiple SIWs in order to absorb ambiguities caused by listener and situation dependency. Listening experiments with seven subjects showed that automatic SIW recognition based on the basic phoneme-groups outperformed recognition based on the articulation-based phoneme-groups and recognition based on Japanese phonemes. The proposed system thus proved better suited for use in human-computer interaction.",
author = "Kazushi Ishihara and Tomohiro Nakatani and Tetsuya Ogata and Okuno, {Hiroshi G.}",
year = "2004",
language = "English",
volume = "3157",
pages = "909--918",
editor = "C. Zhang and H.W. Guesgen and W.K. Yeap",
booktitle = "Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)",

}

TY - GEN

T1 - Automatic sound-imitation word recognition from environmental sounds focusing on ambiguity problem in determining phonemes

AU - Ishihara, Kazushi

AU - Nakatani, Tomohiro

AU - Ogata, Tetsuya

AU - Okuno, Hiroshi G.

PY - 2004

Y1 - 2004

AB - Sound-imitation words (SIWs), or onomatopoeia, are important for human-computer interaction and the automatic tagging of sound archives. The main problem in automatic SIW recognition is ambiguity in determining phonemes, since different listeners hear the same environmental sound as different SIWs even in the same situation. To solve this problem, we designed a set of new phonemes, called the basic phoneme-group set, to represent environmental sounds, in addition to a set of articulation-based phoneme-groups. Automatic SIW recognition based on hidden Markov models (HMMs) with the basic phoneme-groups is allowed to generate multiple SIWs in order to absorb ambiguities caused by listener and situation dependency. Listening experiments with seven subjects showed that automatic SIW recognition based on the basic phoneme-groups outperformed recognition based on the articulation-based phoneme-groups and recognition based on Japanese phonemes. The proposed system thus proved better suited for use in human-computer interaction.

UR - http://www.scopus.com/inward/record.url?scp=22944489210&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=22944489210&partnerID=8YFLogxK

M3 - Conference contribution

VL - 3157

SP - 909

EP - 918

BT - Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)

A2 - Zhang, C.

A2 - Guesgen, H.W.

A2 - Yeap, W.K.

ER -