Particle-filter based audio-visual beat-tracking for music robot ensemble with human guitarist

Tatsuhiko Itohara, Takuma Otsuka, Takeshi Mizumoto, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Citations (Scopus)

Abstract

This paper presents an audio-visual beat-tracking method for ensemble robots performing with a human guitarist. Beat-tracking, i.e., the estimation of the tempo and beat times of music, is critical to high-quality ensemble performance. Because a human plays the guitar off the beat, with back beats and syncopation, beat-tracking of human guitar playing faces two main problems: tempo changes and varying note lengths. Most conventional methods do not address human guitar playing and therefore fail to adapt to one or both of these problems. To solve both problems simultaneously, our method uses not only audio but also visual features. We extract audio features with Spectro-Temporal Pattern Matching (STPM) and visual features with optical flow, mean shift, and the Hough transform. Our beat-tracker estimates tempo and beat times using a particle filter; both the acoustic features of guitar sounds and the visual features of arm motions are represented as particles, and each particle is determined based on the prior distributions of the audio and visual features, respectively. Experimental results confirm that our integrated audio-visual approach is robust against tempo changes and varying note lengths. They also show that the estimation convergence rate depends only weakly on the number of particles. The real-time factor is 0.88 with 200 particles, which shows that our method works in real time.
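To make the overall structure of such a method concrete, the following is a minimal, illustrative sketch of a particle filter over tempo and beat phase that fuses a per-frame audio onset cue (e.g., from STPM) with a visual arm-motion cue (e.g., from optical flow). This is not the authors' implementation; the state layout, priors, likelihood, fusion weights, and all names (step, beat_likelihood, w_audio, w_visual, FRAME_PERIOD) are assumptions made purely for illustration.

# Illustrative sketch only: a minimal particle filter over (tempo, beat phase),
# fusing an audio onset signal and a visual arm-motion signal. Names and
# parameter values are hypothetical, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

N_PARTICLES = 200      # the paper reports a real-time factor of 0.88 at 200 particles
FRAME_PERIOD = 0.01    # seconds between feature frames (assumed)

# State per particle: [tempo_bpm, phase], where phase in [0, 1) is the position within a beat.
particles = np.column_stack([
    rng.uniform(60.0, 180.0, N_PARTICLES),   # tempo prior (assumed range)
    rng.uniform(0.0, 1.0, N_PARTICLES),      # beat-phase prior
])
weights = np.full(N_PARTICLES, 1.0 / N_PARTICLES)

def beat_likelihood(phase, onset_strength, sharpness=20.0):
    """Higher when an observed onset coincides with the predicted beat (phase near 0)."""
    alignment = np.exp(-sharpness * np.minimum(phase, 1.0 - phase) ** 2)
    return 1e-3 + onset_strength * alignment

def step(audio_onset, visual_onset, w_audio=0.6, w_visual=0.4):
    """One filter update given scalar audio/visual onset strengths in [0, 1]."""
    global particles, weights
    # Predict: drift tempo slightly, advance phase by the elapsed fraction of a beat.
    particles[:, 0] += rng.normal(0.0, 0.5, N_PARTICLES)   # tempo random walk
    particles[:, 1] = (particles[:, 1] + particles[:, 0] / 60.0 * FRAME_PERIOD) % 1.0
    # Weight: fuse the two modalities as a weighted product of likelihoods.
    lik = (beat_likelihood(particles[:, 1], audio_onset) ** w_audio
           * beat_likelihood(particles[:, 1], visual_onset) ** w_visual)
    weights = weights * lik
    weights /= weights.sum()
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < N_PARTICLES / 2:
        idx = rng.choice(N_PARTICLES, N_PARTICLES, p=weights)
        particles = particles[idx]
        weights = np.full(N_PARTICLES, 1.0 / N_PARTICLES)
    # Point estimates: weighted mean tempo, circular mean beat phase.
    tempo = float(np.average(particles[:, 0], weights=weights))
    ang = 2 * np.pi * particles[:, 1]
    phase = float((np.arctan2(np.sum(weights * np.sin(ang)),
                              np.sum(weights * np.cos(ang))) / (2 * np.pi)) % 1.0)
    return tempo, phase

# Usage: feed per-frame onset strengths from the audio and visual front ends.
tempo_est, phase_est = step(audio_onset=0.8, visual_onset=0.6)

In this sketch the beat time would be read off whenever the estimated phase wraps past zero; how the real system schedules the robot's ensemble response is beyond what the abstract states.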

Original language: English
Title of host publication: IEEE International Conference on Intelligent Robots and Systems
Pages: 118-124
Number of pages: 7
DOIs: https://doi.org/10.1109/IROS.2011.6048380
ISBN (Print): 9781612844541
Publication status: Published - 2011
Externally published: Yes
Event: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems: Celebrating 50 Years of Robotics, IROS'11 - San Francisco, CA
Duration: 2011 Sep 25 – 2011 Sep 30


Fingerprint

  • Robots
  • Hough transforms
  • Optical flows
  • Pattern matching
  • Acoustics
  • Acoustic waves

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Computer Vision and Pattern Recognition
  • Computer Science Applications

Cite this

Itohara, T., Otsuka, T., Mizumoto, T., Ogata, T., & Okuno, H. G. (2011). Particle-filter based audio-visual beat-tracking for music robot ensemble with human guitarist. In IEEE International Conference on Intelligent Robots and Systems (pp. 118-124). [6048380] https://doi.org/10.1109/IROS.2011.6048380

