Incremental bayesian audio-to-score alignment with flexible harmonic structure models

Takuma Otsuka, Kazuhiro Nakadai, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

Music information retrieval, especially the audio-to-score alignment problem, often involves a matching problem between the audio and symbolic representations. We must cope with uncertainty in the audio signal generated from the score in a symbolic representation, such as variation in timbre or temporal fluctuations. Existing audio-to-score alignment methods are sometimes vulnerable to this uncertainty when multiple notes are played simultaneously with a variety of timbres, because these methods rely on static observation models. For example, a chroma vector or a fixed harmonic structure template is used under the assumption that the musical notes in a chord all have the same volume and timbre. This paper presents a particle filter-based audio-to-score alignment method with a flexible observation model based on latent harmonic allocation. Our method adapts the harmonic structure for audio-to-score matching based on the observation of the audio signal through Bayesian inference. Experimental results with 20 polyphonic songs reveal that our method is effective when a larger number of instruments are involved in the ensemble.

Original language: English
Title of host publication: Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011
Pages: 525-530
Number of pages: 6
Publication status: Published - 2011
Externally published: Yes
Event: 12th International Society for Music Information Retrieval Conference, ISMIR 2011 - Miami, FL
Duration: 2011 Oct 24 to 2011 Oct 28

Other

Other: 12th International Society for Music Information Retrieval Conference, ISMIR 2011
City: Miami, FL
Period: 11/10/24 to 11/10/28

Fingerprint

Model structures
Information retrieval
Harmonics
Alignment
Uncertainty
Timbre

ASJC Scopus subject areas

  • Music
  • Information Systems

Cite this

Otsuka, T., Nakadai, K., Ogata, T., & Okuno, H. G. (2011). Incremental bayesian audio-to-score alignment with flexible harmonic structure models. In Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011 (pp. 525-530)

Incremental bayesian audio-to-score alignment with flexible harmonic structure models. / Otsuka, Takuma; Nakadai, Kazuhiro; Ogata, Tetsuya; Okuno, Hiroshi G.

Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011. 2011. p. 525-530.

Otsuka, T, Nakadai, K, Ogata, T & Okuno, HG 2011, Incremental bayesian audio-to-score alignment with flexible harmonic structure models. in Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011. pp. 525-530, 12th International Society for Music Information Retrieval Conference, ISMIR 2011, Miami, FL, 11/10/24.
Otsuka T, Nakadai K, Ogata T, Okuno HG. Incremental bayesian audio-to-score alignment with flexible harmonic structure models. In Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011. 2011. p. 525-530
Otsuka, Takuma ; Nakadai, Kazuhiro ; Ogata, Tetsuya ; Okuno, Hiroshi G. / Incremental bayesian audio-to-score alignment with flexible harmonic structure models. Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011. 2011. pp. 525-530
@inproceedings{c6a7ba908716419c85f55d24cec032b6,
title = "Incremental bayesian audio-to-score alignment with flexible harmonic structure models",
abstract = "Music information retrieval, especially the audio-to-score alignment problem, often involves a matching problem between the audio and symbolic representations. We must cope with uncertainty in the audio signal generated from the score in a symbolic representation, such as variation in timbre or temporal fluctuations. Existing audio-to-score alignment methods are sometimes vulnerable to this uncertainty when multiple notes are played simultaneously with a variety of timbres, because these methods rely on static observation models. For example, a chroma vector or a fixed harmonic structure template is used under the assumption that the musical notes in a chord all have the same volume and timbre. This paper presents a particle filter-based audio-to-score alignment method with a flexible observation model based on latent harmonic allocation. Our method adapts the harmonic structure for audio-to-score matching based on the observation of the audio signal through Bayesian inference. Experimental results with 20 polyphonic songs reveal that our method is effective when a larger number of instruments are involved in the ensemble.",
author = "Takuma Otsuka and Kazuhiro Nakadai and Tetsuya Ogata and Okuno, {Hiroshi G.}",
year = "2011",
language = "English",
isbn = "9780615548654",
pages = "525--530",
booktitle = "Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011",

}

TY  - GEN
T1  - Incremental bayesian audio-to-score alignment with flexible harmonic structure models
AU  - Otsuka, Takuma
AU  - Nakadai, Kazuhiro
AU  - Ogata, Tetsuya
AU  - Okuno, Hiroshi G.
PY  - 2011
Y1  - 2011
N2  - Music information retrieval, especially the audio-to-score alignment problem, often involves a matching problem between the audio and symbolic representations. We must cope with uncertainty in the audio signal generated from the score in a symbolic representation, such as variation in timbre or temporal fluctuations. Existing audio-to-score alignment methods are sometimes vulnerable to this uncertainty when multiple notes are played simultaneously with a variety of timbres, because these methods rely on static observation models. For example, a chroma vector or a fixed harmonic structure template is used under the assumption that the musical notes in a chord all have the same volume and timbre. This paper presents a particle filter-based audio-to-score alignment method with a flexible observation model based on latent harmonic allocation. Our method adapts the harmonic structure for audio-to-score matching based on the observation of the audio signal through Bayesian inference. Experimental results with 20 polyphonic songs reveal that our method is effective when a larger number of instruments are involved in the ensemble.
UR  - http://www.scopus.com/inward/record.url?scp=84873592130&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84873592130&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:84873592130
SN  - 9780615548654
SP  - 525
EP  - 530
BT  - Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011
ER  -