Design and collection of acoustic sound data for hands-free speech recognition and sound scene understanding

Satoshi Nakamura, Kazuo Hiyane, Futoshi Asano, Yutaka Kaneda, Takeshi Yamada, Takanobu Nishiura, Tetsunori Kobayashi, Shiro Ise, Hiroshi Saruwatari

Research output: Conference contribution

8 Citations (Scopus)

Abstract

Sound data for open evaluation is necessary for studies such as sound source localization, sound retrieval, sound recognition, and hands-free speech recognition in real acoustic environments. This paper reports on our project for acoustic data collection. There are many kinds of sound scenes in real environments. A sound scene is specified by its sound sources and room acoustics. The number of combinations of sound sources, source positions, and rooms is huge in real acoustic environments. We assumed that the sound in these environments can be simulated by convolving isolated sound sources with impulse responses. As isolated sound sources, a hundred kinds of environmental sounds and speech sounds were collected. The impulse responses were collected in various acoustic environments. Additionally, we collected sounds from a moving source. This paper describes the progress of our sound scene database collection project and its application to environmental sound recognition and hands-free speech recognition.
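
The convolution assumption stated in the abstract can be illustrated with a minimal sketch. The function names `simulate_scene` and `mix_scene` and the toy signals are hypothetical and do not come from the paper; NumPy's `np.convolve` is used here only as a stand-in for whatever tooling the authors used, and a static (non-moving) source is assumed.

```python
import numpy as np


def simulate_scene(dry_source, impulse_response):
    """Simulate one source in a room (hypothetical helper).

    Linearly convolves an isolated ("dry") source signal with a room
    impulse response measured for one source/microphone position.
    """
    dry = np.asarray(dry_source, dtype=float)
    rir = np.asarray(impulse_response, dtype=float)
    # Full linear convolution: output length is len(dry) + len(rir) - 1.
    return np.convolve(dry, rir)


def mix_scene(sources, impulse_responses):
    """Sum several convolved sources into one observed signal (hypothetical helper)."""
    rendered = [simulate_scene(s, h) for s, h in zip(sources, impulse_responses)]
    mix = np.zeros(max(len(r) for r in rendered))
    for r in rendered:
        # Shorter signals are treated as zero-padded at the end.
        mix[: len(r)] += r
    return mix


# Toy data (an assumption for illustration, not from the database):
# a 3-sample source and an impulse response with one delayed reflection.
observed = simulate_scene([1.0, 2.0, 3.0], [1.0, 0.0, 0.5])
```

Under this assumption, each combination of source, position, and room can be generated by pairing a recorded dry source with the matching impulse response, rather than recording every combination directly.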

Original language: English
Title of host publication: Proceedings - 2002 IEEE International Conference on Multimedia and Expo, ICME 2002
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 161-164
Number of pages: 4
ISBN (Electronic): 0780373049
DOI: https://doi.org/10.1109/ICME.2002.1035537
Publication status: Published - 2002
Event: 2002 IEEE International Conference on Multimedia and Expo, ICME 2002 - Lausanne, Switzerland
Duration: 2002-08-26 to 2002-08-29

Publication series

Name: Proceedings - 2002 IEEE International Conference on Multimedia and Expo, ICME 2002
Volume: 2

Conference

Conference: 2002 IEEE International Conference on Multimedia and Expo, ICME 2002
Country: Switzerland
City: Lausanne
Period: 2002-08-26 to 2002-08-29

ASJC Scopus subject areas

  • Archaeology
  • Electrical and Electronic Engineering

  • Cite this

    Nakamura, S., Hiyane, K., Asano, F., Kaneda, Y., Yamada, T., Nishiura, T., Kobayashi, T., Ise, S., & Saruwatari, H. (2002). Design and collection of acoustic sound data for hands-free speech recognition and sound scene understanding. In Proceedings - 2002 IEEE International Conference on Multimedia and Expo, ICME 2002 (pp. 161-164). [1035537] (Proceedings - 2002 IEEE International Conference on Multimedia and Expo, ICME 2002; Vol. 2). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICME.2002.1035537