Real-time audio-to-score alignment using particle filter for coplayer music robots

Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Article

22 Citations (Scopus)

Abstract

Our goal is to develop a coplayer music robot capable of presenting a musical expression together with humans. Although many instrument-performing robots exist, they may have difficulty playing with human performers because they lack a synchronization function. To play with human performers, the robot has to follow variations in the human performance, such as temporal fluctuations. To cope with erroneous synchronization, we classify synchronization and musical expression into two levels: (1) the melody level and (2) the rhythm level. The idea is as follows: when the synchronization with the melody is reliable, the robot responds to the pitch it hears; when the synchronization is uncertain, it tries to follow the rhythm of the music. Our method estimates the score position for the melody level and the tempo for the rhythm level. The reliability of the score position estimate is extracted from the probability distribution of the score position. The experimental results demonstrate that our method outperforms the existing score following system on 16 of 20 polyphonic songs. The error in the prediction of the score position is reduced by 69% on average. The results also show that the switching mechanism alleviates the error in the estimation of the score position.
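The two-level switching idea in the abstract can be illustrated with a small sketch. The abstract does not say how reliability is computed from the score-position distribution, so everything specific here is an assumption made for illustration: the function name `estimate_and_switch`, the use of the weighted standard deviation of particle positions as the reliability measure, the unit of beats, and the `spread_threshold` value. It is not the authors' implementation.

```python
import math


def estimate_and_switch(positions, weights, spread_threshold=1.0):
    """Return (mean_position, spread, level) from weighted particles.

    positions: score positions, one per particle (beats; assumed unit)
    weights: non-negative particle weights, not necessarily normalized
    spread_threshold: assumed cutoff in beats, not a value from the paper
    """
    total = sum(weights)
    if len(positions) != len(weights) or total <= 0:
        raise ValueError("need matching positions and weights with positive total weight")

    # Weighted mean of the score-position distribution.
    mean = sum(p * w for p, w in zip(positions, weights)) / total

    # Weighted standard deviation, used here as a stand-in for reliability:
    # a narrow distribution is treated as a reliable position estimate.
    variance = sum(w * (p - mean) ** 2 for p, w in zip(positions, weights)) / total
    spread = math.sqrt(variance)

    # Reliable estimate: follow the melody; otherwise fall back to the rhythm.
    level = "melody" if spread <= spread_threshold else "rhythm"
    return mean, spread, level
```

With particles concentrated at one score position the sketch selects the melody level; with particles split between distant positions it selects the rhythm level, where only the tempo estimate would be used.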

Original language: English
Article number: 384651
Journal: Eurasip Journal on Advances in Signal Processing
Volume: 2011
DOI: 10.1155/2011/384651
Publication status: Published - 2011
Externally published: Yes

Fingerprint

Synchronization
Robots
Probability distributions

ASJC Scopus subject areas

  • Hardware and Architecture
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

@article{17b06fa87f7d471aae594309b45e6240,
title = "Real-time audio-to-score alignment using particle filter for coplayer music robots",
author = "Takuma Otsuka and Kazuhiro Nakadai and Toru Takahashi and Tetsuya Ogata and Okuno, {Hiroshi G.}",
year = "2011",
doi = "10.1155/2011/384651",
language = "English",
volume = "2011",
journal = "Eurasip Journal on Advances in Signal Processing",
issn = "1687-6172",
publisher = "Hindawi Publishing Corporation",

}

TY - JOUR

T1 - Real-time audio-to-score alignment using particle filter for coplayer music robots

AU - Otsuka, Takuma

AU - Nakadai, Kazuhiro

AU - Takahashi, Toru

AU - Ogata, Tetsuya

AU - Okuno, Hiroshi G.

PY - 2011

Y1 - 2011

UR - http://www.scopus.com/inward/record.url?scp=79952119269&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79952119269&partnerID=8YFLogxK

U2 - 10.1155/2011/384651

DO - 10.1155/2011/384651

M3 - Article

AN - SCOPUS:79952119269

VL - 2011

JO - Eurasip Journal on Advances in Signal Processing

JF - Eurasip Journal on Advances in Signal Processing

SN - 1687-6172

M1 - 384651

ER -