Computational Auditory Scene Analysis and Its Application to Robot Audition

Hiroshi G. Okuno, Tetsuya Ogata, Kazunori Komatani, Kazuhiro Nakadai

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

13 Citations (Scopus)

Abstract

We are engaged in research on computational auditory scene analysis to attain sophisticated robot (computer) human interaction by recognizing auditory awareness. The objective of our research is the understanding of an arbitrary sound mixture including non-speech sounds and music as well as voiced speech, obtained by robot's ears (or microphones embedded in the robot). The main issues are sound source localization, separation, and recognition at signal processing levels, and signal-to-symbol transformation at the interface level to symbol processing levels. The latter is critical in developmental communication and we are developing an automatic onomatopoeia recognition system. This paper overviews our activities in robot audition, in particular, active direction-pass filter (ADPF) that separates sounds originating from a specific direction by integrating sound source localization and visual processing. ADPF is implemented on three kinds of robots and demonstrates separating and recognizing three simultaneous speeches with a pair of microphones.

Original language: English
Title of host publication: Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004
Editors: T. Ibaraki, T. Inui, K. Tanaka
Pages: 73-80
Number of pages: 8
DOIs: https://doi.org/10.1109/ICKS.2004.1313411
Publication status: Published - 2004
Externally published: Yes
Event: Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004 - Kyoto
Duration: 2004 Mar 1 to 2004 Mar 2

Other

Other: Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004
City: Kyoto
Period: 04/3/1 to 04/3/2

Fingerprint

Audition
Acoustic waves
Robots
Microphones
Source separation
Human computer interaction
Processing
Signal processing
Communication

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Okuno, H. G., Ogata, T., Komatani, K., & Nakadai, K. (2004). Computational Auditory Scene Analysis and Its Application to Robot Audition. In T. Ibaraki, T. Inui, & K. Tanaka (Eds.), Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004 (pp. 73-80). https://doi.org/10.1109/ICKS.2004.1313411

Computational Auditory Scene Analysis and Its Application to Robot Audition. / Okuno, Hiroshi G.; Ogata, Tetsuya; Komatani, Kazunori; Nakadai, Kazuhiro.

Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004. ed. / T. Ibaraki; T. Inui; K. Tanaka. 2004. p. 73-80.

Okuno, HG, Ogata, T, Komatani, K & Nakadai, K 2004, Computational Auditory Scene Analysis and Its Application to Robot Audition. in T Ibaraki, T Inui & K Tanaka (eds), Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004. pp. 73-80, Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004, Kyoto, 04/3/1. https://doi.org/10.1109/ICKS.2004.1313411
Okuno HG, Ogata T, Komatani K, Nakadai K. Computational Auditory Scene Analysis and Its Application to Robot Audition. In Ibaraki T, Inui T, Tanaka K, editors, Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004. 2004. p. 73-80 https://doi.org/10.1109/ICKS.2004.1313411
Okuno, Hiroshi G. ; Ogata, Tetsuya ; Komatani, Kazunori ; Nakadai, Kazuhiro. / Computational Auditory Scene Analysis and Its Application to Robot Audition. Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004. editor / T. Ibaraki ; T. Inui ; K. Tanaka. 2004. pp. 73-80
@inproceedings{b18f6978f1e84e13837dd3bed916389e,
title = "Computational Auditory Scene Analysis and Its Application to Robot Audition",
abstract = "We are engaged in research on computational auditory scene analysis to attain sophisticated robot (computer) human interaction by recognizing auditory awareness. The objective of our research is the understanding of an arbitrary sound mixture including non-speech sounds and music as well as voiced speech, obtained by robot's ears (or microphones embedded in the robot). The main issues are sound source localization, separation, and recognition at signal processing levels, and signal-to-symbol transformation at the interface level to symbol processing levels. The latter is critical in developmental communication and we are developing an automatic onomatopoeia recognition system. This paper overviews our activities in robot audition, in particular, active direction-pass filter (ADPF) that separates sounds originating from a specific direction by integrating sound source localization and visual processing. ADPF is implemented on three kinds of robots and demonstrates separating and recognizing three simultaneous speeches with a pair of microphones.",
author = "Okuno, {Hiroshi G.} and Tetsuya Ogata and Kazunori Komatani and Kazuhiro Nakadai",
year = "2004",
doi = "10.1109/ICKS.2004.1313411",
language = "English",
isbn = "0769521509",
pages = "73--80",
editor = "T. Ibaraki and T. Inui and K. Tanaka",
booktitle = "Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004",

}

TY - GEN

T1 - Computational Auditory Scene Analysis and Its Application to Robot Audition

AU - Okuno, Hiroshi G.

AU - Ogata, Tetsuya

AU - Komatani, Kazunori

AU - Nakadai, Kazuhiro

PY - 2004

Y1 - 2004

N2 - We are engaged in research on computational auditory scene analysis to attain sophisticated robot (computer) human interaction by recognizing auditory awareness. The objective of our research is the understanding of an arbitrary sound mixture including non-speech sounds and music as well as voiced speech, obtained by robot's ears (or microphones embedded in the robot). The main issues are sound source localization, separation, and recognition at signal processing levels, and signal-to-symbol transformation at the interface level to symbol processing levels. The latter is critical in developmental communication and we are developing an automatic onomatopoeia recognition system. This paper overviews our activities in robot audition, in particular, active direction-pass filter (ADPF) that separates sounds originating from a specific direction by integrating sound source localization and visual processing. ADPF is implemented on three kinds of robots and demonstrates separating and recognizing three simultaneous speeches with a pair of microphones.

AB - We are engaged in research on computational auditory scene analysis to attain sophisticated robot (computer) human interaction by recognizing auditory awareness. The objective of our research is the understanding of an arbitrary sound mixture including non-speech sounds and music as well as voiced speech, obtained by robot's ears (or microphones embedded in the robot). The main issues are sound source localization, separation, and recognition at signal processing levels, and signal-to-symbol transformation at the interface level to symbol processing levels. The latter is critical in developmental communication and we are developing an automatic onomatopoeia recognition system. This paper overviews our activities in robot audition, in particular, active direction-pass filter (ADPF) that separates sounds originating from a specific direction by integrating sound source localization and visual processing. ADPF is implemented on three kinds of robots and demonstrates separating and recognizing three simultaneous speeches with a pair of microphones.

UR - http://www.scopus.com/inward/record.url?scp=10444249505&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=10444249505&partnerID=8YFLogxK

U2 - 10.1109/ICKS.2004.1313411

DO - 10.1109/ICKS.2004.1313411

M3 - Conference contribution

AN - SCOPUS:10444249505

SN - 0769521509

SN - 9780769521503

SP - 73

EP - 80

BT - Proceedings - International Conference on Informatics Research for Development of Knowledge Society Infrastructure, ICKS 2004

A2 - Ibaraki, T.

A2 - Inui, T.

A2 - Tanaka, K.

ER -