Sound annotation tool for multidirectional sounds based on spatial information extracted by HARK robot audition software

Osamu Sugiyama, Katsutoshi Itoyama, Kazuhiro Nakada, Hiroshi G. Okuno

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    With the rise of inexpensive microphone array products and the robot audition software HARK, we can easily record and analyze multidirectional sound sources. The combination of a microphone array and the software enables us to separate, localize, and track multidirectional sound sources. Most solutions for accessing this separated sound-source information provide clients that present simplified information about the separated sources, but do not let users attach semantic annotations directly. Since multidirectional sound annotation requires simultaneous labeling of separated sound sources and a multidirectional overview of those sources, an efficient annotation workflow and an intuitive view of multidirectional sounds are essential. Our proposed sound annotation tool provides drag-and-drop annotation within a 3D sound source view, as well as annotation autocompletion using an SVM trained on the user's annotation history. These features enable users to perform the annotation task intuitively and confirm its results. We also conducted an evaluation demonstrating the efficiency of annotation performed with the tool.
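The annotation autocompletion the abstract describes can be illustrated with a small sketch (not the authors' implementation): an SVM is trained on the user's annotation history and then suggests a label for a newly separated source. The feature set here (per-source azimuth, duration, and mean power) and the label vocabulary are hypothetical stand-ins for whatever HARK-derived features the tool actually uses.

```python
# Illustrative sketch of SVM-based annotation autocompletion.
# Assumption: each separated source is summarized by three hypothetical
# features -- azimuth (degrees), duration (seconds), mean power (dB).
from sklearn.svm import SVC

# Hypothetical annotation history: feature vectors and the labels the
# user previously assigned to those separated sources.
history_features = [
    (10.0, 2.5, -20.0), (15.0, 3.0, -22.0),    # speech from the front
    (170.0, 0.4, -10.0), (175.0, 0.3, -12.0),  # door slams behind
    (90.0, 8.0, -35.0), (95.0, 7.5, -33.0),    # fan noise to the side
]
history_labels = ["speech", "speech", "door", "door", "fan", "fan"]

# Train on the history, then suggest a label for a newly separated source.
clf = SVC(kernel="rbf", gamma="scale")
clf.fit(history_features, history_labels)

new_source = (12.0, 2.8, -21.0)  # resembles the earlier speech sources
suggestion = clf.predict([new_source])[0]
print(suggestion)  # suggested label, offered to the user for confirmation
```

In the tool itself the suggestion would be shown as an autocompletion candidate that the user can accept or override, and each confirmed annotation would be appended to the history for retraining.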

    Original language: English
    Title of host publication: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 2335-2340
    Number of pages: 6
    Volume: 2014-January
    Edition: January
    DOI: 10.1109/smc.2014.6974275
    Publication status: Published - 2014
    Event: 2014 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2014 - San Diego, United States
    Duration: 2014 Oct 5 - 2014 Oct 8


    Keywords

    • Audio annotation
    • Media computing
    • User interface design

    ASJC Scopus subject areas

    • Electrical and Electronic Engineering
    • Control and Systems Engineering
    • Human-Computer Interaction

    Cite this

    Sugiyama, O., Itoyama, K., Nakada, K., & Okuno, H. G. (2014). Sound annotation tool for multidirectional sounds based on spatial information extracted by HARK robot audition software. In Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics (January ed., Vol. 2014-January, pp. 2335-2340). [6974275] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/smc.2014.6974275

    @inproceedings{e3f8068a3a124c91bc868963ee8d9a8d,
    title = "Sound annotation tool for multidirectional sounds based on spatial information extracted by HARK robot audition software",
    keywords = "Audio annotation, Media computing, User interface design",
    author = "Osamu Sugiyama and Katsutoshi Itoyama and Kazuhiro Nakada and Okuno, {Hiroshi G.}",
    year = "2014",
    doi = "10.1109/smc.2014.6974275",
    language = "English",
    volume = "2014-January",
    pages = "2335--2340",
    booktitle = "Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics",
    publisher = "Institute of Electrical and Electronics Engineers Inc.",
    edition = "January",
    }
