Design and implementation of two-level synchronization for an interactive music robot

Takuma Otsuka, Kazuhiro Nakadai, Toru Takahashi, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Our goal is to develop an interactive music robot, i.e., a robot that performs musical expression together with humans. Musical interaction requires two important functions: synchronization with the music, and musical expression such as singing and dancing. Many instrument-performing robots are capable only of the latter function, so they may have difficulty playing live with human performers. The synchronization function is therefore critical for the interaction. We classify synchronization and musical expression into two levels: (1) the rhythm level and (2) the melody level. Two issues arise in achieving two-level synchronization and musical expression: (1) simultaneous estimation of the rhythm structure and the current part of the music, and (2) derivation of the estimation confidence used to switch behavior between the rhythm level and the melody level. This paper presents a score-following algorithm (incremental audio-to-score alignment) that uses a particle filter and conforms to the two-level synchronization design. Our method estimates the score position for the melody level and the tempo for the rhythm level. The reliability of the score position estimate is extracted from the probability distribution of the score position. Experiments are carried out using polyphonic jazz songs. The results confirm that our method switches levels in accordance with the difficulty of the score estimation. When the tempo of the music is below 120 beats per minute (bpm), the estimated score positions are accurate and are reported; when the tempo is above 120 bpm, the system tends to report only the tempo, suppressing errors in the reported score position.
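The level-switching idea in the abstract can be illustrated with a minimal sketch. It assumes a weighted particle set in which each particle carries a score position (in beats) and a tempo (in bpm), and it uses the weighted standard deviation of the score position as a stand-in for the reliability measure. The abstract does not specify the actual measure; the function names and the `max_spread_beats` threshold are hypothetical and chosen only for illustration.

```python
import math


def weighted_mean(values, weights):
    """Weighted average of particle values."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def weighted_std(values, weights):
    """Weighted standard deviation, used here as a spread measure."""
    mean = weighted_mean(values, weights)
    total = sum(weights)
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return math.sqrt(var)


def choose_level(positions, tempos, weights, max_spread_beats=0.5):
    """Pick the synchronization level from the particle distribution.

    Assumption (not from the paper): the position estimate is treated as
    reliable when the particles' score positions are tightly clustered.
    Returns (level, score_position_or_None, tempo).
    """
    tempo = weighted_mean(tempos, weights)
    spread = weighted_std(positions, weights)
    if spread <= max_spread_beats:
        # Melody level: report both the score position and the tempo.
        return ("melody", weighted_mean(positions, weights), tempo)
    # Rhythm level: the position is too uncertain, so report the tempo only.
    return ("rhythm", None, tempo)


# Tightly clustered particles: the position is reported.
tight = choose_level([16.0, 16.1, 15.9], [100.0, 100.0, 100.0], [1.0, 1.0, 1.0])

# Widely scattered particles: only the tempo is reported.
wide = choose_level([4.0, 16.0, 28.0], [150.0, 150.0, 150.0], [1.0, 1.0, 1.0])
```

In this sketch, a concentrated distribution yields the melody level and a scattered one falls back to the rhythm level, which mirrors the reported behavior of withholding score positions when estimation is difficult.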

Original language: English
Title of host publication: Proceedings of the National Conference on Artificial Intelligence
Pages: 1238-1244
Number of pages: 7
Volume: 2
Publication status: Published - 2010
Externally published: Yes
Event: 24th AAAI Conference on Artificial Intelligence and the 22nd Innovative Applications of Artificial Intelligence Conference, AAAI-10 / IAAI-10 - Atlanta, GA
Duration: 2010 Jul 11 → 2010 Jul 15

Other

Other: 24th AAAI Conference on Artificial Intelligence and the 22nd Innovative Applications of Artificial Intelligence Conference, AAAI-10 / IAAI-10
City: Atlanta, GA
Period: 10/7/11 → 10/7/15

Fingerprint

Synchronization
Robots
Switches
Probability distributions
Experiments

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Cite this

Otsuka, T., Nakadai, K., Takahashi, T., Komatani, K., Ogata, T., & Okuno, H. G. (2010). Design and implementation of two-level synchronization for an interactive music robot. In Proceedings of the National Conference on Artificial Intelligence (Vol. 2, pp. 1238-1244).

@inproceedings{bfb7771ede5d478d9e1c8e8936dd671a,
title = "Design and implementation of two-level synchronization for an interactive music robot",
author = "Takuma Otsuka and Kazuhiro Nakadai and Toru Takahashi and Kazunori Komatani and Tetsuya Ogata and Okuno, {Hiroshi G.}",
year = "2010",
language = "English",
isbn = "9781577354659",
volume = "2",
pages = "1238--1244",
booktitle = "Proceedings of the National Conference on Artificial Intelligence",

}

TY - GEN

T1 - Design and implementation of two-level synchronization for an interactive music robot

AU - Otsuka, Takuma

AU - Nakadai, Kazuhiro

AU - Takahashi, Toru

AU - Komatani, Kazunori

AU - Ogata, Tetsuya

AU - Okuno, Hiroshi G.

PY - 2010

Y1 - 2010

UR - http://www.scopus.com/inward/record.url?scp=77958581833&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77958581833&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:77958581833

SN - 9781577354659

VL - 2

SP - 1238

EP - 1244

BT - Proceedings of the National Conference on Artificial Intelligence

ER -