A multimodal tempo and beat-tracking system based on audiovisual information from live guitar performances

Tatsuhiko Itohara, Takuma Otsuka, Takeshi Mizumoto, Angelica Lim, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

The aim of this paper is to improve beat-tracking for live guitar performances. Beat-tracking is a function to estimate musical measurements, for example musical tempo and phase. This method is critical to achieve a synchronized ensemble performance such as musical robot accompaniment. Beat-tracking of a live guitar performance has to deal with three challenges: tempo fluctuation, beat pattern complexity and environmental noise. To cope with these problems, we devise an audiovisual integration method for beat-tracking. The auditory beat features are estimated in terms of tactus (phase) and tempo (period) by Spectro-Temporal Pattern Matching (STPM), robust against stationary noise. The visual beat features are estimated by tracking the position of the hand relative to the guitar using optical flow, mean shift and the Hough transform. Both estimated features are integrated using a particle filter to aggregate the multimodal information based on a beat location model and a hand's trajectory model. Experimental results confirm that our beat-tracking improves the F-measure by 8.9 points on average over the Murata beat-tracking method, which uses STPM and rule-based beat detection. The results also show that the system is capable of real-time processing with a suppressed number of particles while preserving the estimation accuracy. We demonstrate an ensemble with the humanoid HRP-2 that plays the theremin with a human guitarist.
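To make the audiovisual fusion idea concrete, the following is a minimal, illustrative particle-filter sketch, not the authors' implementation. Each particle carries a (period, phase) state, and its weight combines an audio likelihood and a visual likelihood; the helpers audio_onset_strength() and hand_velocity(), the noise levels, and the synthetic 0.5 s beat period are all hypothetical stand-ins for the paper's STPM features and hand-tracking trajectory model.

```python
# Illustrative sketch: fusing audio and visual beat cues with a particle filter.
# Assumptions (not from the paper): synthetic observation functions, noise levels,
# and likelihood shapes; state per particle is (beat period, beat phase).
import numpy as np

rng = np.random.default_rng(0)

N = 200                                 # particle count (kept small for real-time use)
periods = rng.uniform(0.4, 1.0, N)      # beat period in seconds (~60-150 BPM)
phases = rng.uniform(0.0, 1.0, N)       # beat phase in [0, 1)
weights = np.full(N, 1.0 / N)

def audio_onset_strength(t):
    """Hypothetical stand-in for an STPM-style onset reliability at time t."""
    return np.exp(-((t % 0.5) ** 2) / 0.005)      # synthetic: onsets every 0.5 s

def hand_velocity(t):
    """Hypothetical stand-in for the tracked strumming-hand velocity at time t."""
    return np.cos(2 * np.pi * t / 0.5)            # synthetic: one stroke per 0.5 s

dt = 0.05                                # frame hop in seconds
for frame in range(200):
    t = frame * dt

    # Predict: advance each particle's phase by dt/period, diffuse the period slightly.
    periods += rng.normal(0.0, 0.005, N)
    phases = (phases + dt / periods + rng.normal(0.0, 0.01, N)) % 1.0

    # Update: particles whose phase is near a beat should coincide with strong
    # onsets (audio cue) and with the downward turning point of the hand (visual cue).
    near_beat = np.exp(-np.minimum(phases, 1.0 - phases) ** 2 / 0.01)
    audio_like = 1e-3 + near_beat * audio_onset_strength(t)
    visual_like = 1e-3 + near_beat * max(0.0, -hand_velocity(t))
    weights *= audio_like * visual_like
    weights /= weights.sum()

    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < N / 2:
        idx = rng.choice(N, N, p=weights)
        periods, phases = periods[idx], phases[idx]
        weights = np.full(N, 1.0 / N)

tempo_bpm = 60.0 / np.average(periods, weights=weights)
print(f"estimated tempo: {tempo_bpm:.1f} BPM")
```

In this toy setting the weighted mean period converges toward the synthetic 0.5 s beat; the paper's contribution lies in the actual audio and visual observation models and in keeping the particle count low enough for real-time operation.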

Original language: English
Article number: 6
Journal: Eurasip Journal on Audio, Speech, and Music Processing
Volume: 2012
Issue number: 1
DOIs: 10.1186/1687-4722-2012-6
Publication status: Published - 2012
Externally published: Yes


ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

A multimodal tempo and beat-tracking system based on audiovisual information from live guitar performances. / Itohara, Tatsuhiko; Otsuka, Takuma; Mizumoto, Takeshi; Lim, Angelica; Ogata, Tetsuya; Okuno, Hiroshi G.

In: Eurasip Journal on Audio, Speech, and Music Processing, Vol. 2012, No. 1, 6, 2012.

Research output: Contribution to journal › Article

@article{f4efdba3fe38404ea3d9b8be7013ae37,
title = "A multimodal tempo and beat-tracking system based on audiovisual information from live guitar performances",
abstract = "The aim of this paper is to improve beat-tracking for live guitar performances. Beat-tracking is a function to estimate musical measurements, for example musical tempo and phase. This method is critical to achieve a synchronized ensemble performance such as musical robot accompaniment. Beat-tracking of a live guitar performance has to deal with three challenges: tempo fluctuation, beat pattern complexity and environmental noise. To cope with these problems, we devise an audiovisual integration method for beat-tracking. The auditory beat features are estimated in terms of tactus (phase) and tempo (period) by Spectro-Temporal Pattern Matching (STPM), robust against stationary noise. The visual beat features are estimated by tracking the position of the hand relative to the guitar using optical flow, mean shift and the Hough transform. Both estimated features are integrated using a particle filter to aggregate the multimodal information based on a beat location model and a hand's trajectory model. Experimental results confirm that our beat-tracking improves the F-measure by 8.9 points on average over the Murata beat-tracking method, which uses STPM and rule-based beat detection. The results also show that the system is capable of real-time processing with a suppressed number of particles while preserving the estimation accuracy. We demonstrate an ensemble with the humanoid HRP-2 that plays the theremin with a human guitarist.",
author = "Itohara, Tatsuhiko and Otsuka, Takuma and Mizumoto, Takeshi and Lim, Angelica and Ogata, Tetsuya and Okuno, {Hiroshi G.}",
year = "2012",
doi = "10.1186/1687-4722-2012-6",
language = "English",
volume = "2012",
journal = "Eurasip Journal on Audio, Speech, and Music Processing",
issn = "1687-4714",
publisher = "Springer Publishing Company",
number = "1",

}

TY - JOUR

T1 - A multimodal tempo and beat-tracking system based on audiovisual information from live guitar performances

AU - Itohara, Tatsuhiko

AU - Otsuka, Takuma

AU - Mizumoto, Takeshi

AU - Lim, Angelica

AU - Ogata, Tetsuya

AU - Okuno, Hiroshi G.

PY - 2012

Y1 - 2012

N2 - The aim of this paper is to improve beat-tracking for live guitar performances. Beat-tracking is a function to estimate musical measurements, for example musical tempo and phase. This method is critical to achieve a synchronized ensemble performance such as musical robot accompaniment. Beat-tracking of a live guitar performance has to deal with three challenges: tempo fluctuation, beat pattern complexity and environmental noise. To cope with these problems, we devise an audiovisual integration method for beat-tracking. The auditory beat features are estimated in terms of tactus (phase) and tempo (period) by Spectro-Temporal Pattern Matching (STPM), robust against stationary noise. The visual beat features are estimated by tracking the position of the hand relative to the guitar using optical flow, mean shift and the Hough transform. Both estimated features are integrated using a particle filter to aggregate the multimodal information based on a beat location model and a hand's trajectory model. Experimental results confirm that our beat-tracking improves the F-measure by 8.9 points on average over the Murata beat-tracking method, which uses STPM and rule-based beat detection. The results also show that the system is capable of real-time processing with a suppressed number of particles while preserving the estimation accuracy. We demonstrate an ensemble with the humanoid HRP-2 that plays the theremin with a human guitarist.

AB - The aim of this paper is to improve beat-tracking for live guitar performances. Beat-tracking is a function to estimate musical measurements, for example musical tempo and phase. This method is critical to achieve a synchronized ensemble performance such as musical robot accompaniment. Beat-tracking of a live guitar performance has to deal with three challenges: tempo fluctuation, beat pattern complexity and environmental noise. To cope with these problems, we devise an audiovisual integration method for beat-tracking. The auditory beat features are estimated in terms of tactus (phase) and tempo (period) by Spectro-Temporal Pattern Matching (STPM), robust against stationary noise. The visual beat features are estimated by tracking the position of the hand relative to the guitar using optical flow, mean shift and the Hough transform. Both estimated features are integrated using a particle filter to aggregate the multimodal information based on a beat location model and a hand's trajectory model. Experimental results confirm that our beat-tracking improves the F-measure by 8.9 points on average over the Murata beat-tracking method, which uses STPM and rule-based beat detection. The results also show that the system is capable of real-time processing with a suppressed number of particles while preserving the estimation accuracy. We demonstrate an ensemble with the humanoid HRP-2 that plays the theremin with a human guitarist.

UR - http://www.scopus.com/inward/record.url?scp=84872293111&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872293111&partnerID=8YFLogxK

U2 - 10.1186/1687-4722-2012-6

DO - 10.1186/1687-4722-2012-6

M3 - Article

VL - 2012

JO - Eurasip Journal on Audio, Speech, and Music Processing

JF - Eurasip Journal on Audio, Speech, and Music Processing

SN - 1687-4714

IS - 1

M1 - 6

ER -