Audio part mixture alignment based on hierarchical nonparametric Bayesian model of musical audio sequence collection

Akira Maezawa, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

2 Citations (Scopus)

Abstract

This paper proposes 'audio part mixture alignment,' a method for temporally aligning multiple audio signals, each of which is a rendition of a non-disjoint subset of a common piece of music. The method decomposes each audio signal into components shared across renditions and components unique to each rendition. At the same time, it aligns the audio signals based on the shared components. The decomposition of each audio signal is modeled using a hierarchical Dirichlet process (HDP), and sequence alignment is modeled as a left-to-right hidden Markov model (HMM). Variational Bayesian inference is used to jointly infer the alignment and the component decomposition. In a comparison with a classic audio-to-audio alignment method, the proposed method is found to be more robust to discrepancies in parts between two audio signals.
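To make the alignment component of the abstract concrete, the following is a minimal sketch of aligning an observed frame sequence to a reference with a left-to-right HMM. It is not the paper's model: it omits the HDP decomposition and the variational Bayesian inference entirely, and instead uses Viterbi decoding with fixed Gaussian emissions. The function name align_left_to_right and the parameters p_stay and sigma are hypothetical, chosen only for illustration.

```python
import math


def align_left_to_right(ref, obs, p_stay=0.5, sigma=1.0):
    """Viterbi alignment of obs to ref under a left-to-right HMM.

    Illustrative assumption: state i corresponds to ref[i], and from state i
    the chain may only stay in i or advance to i + 1. Emissions are isotropic
    Gaussians centred on ref[i]. Returns, for each observed frame, the index
    of the reference frame it is aligned to.
    """
    n_states, n_frames = len(ref), len(obs)
    if n_states == 0 or n_frames < n_states:
        raise ValueError("need at least as many observed frames as states")

    log_stay, log_move = math.log(p_stay), math.log(1.0 - p_stay)
    neg_inf = float("-inf")

    def log_emit(i, t):
        # Gaussian log-likelihood up to an additive constant.
        d2 = sum((a - b) ** 2 for a, b in zip(ref[i], obs[t]))
        return -d2 / (2.0 * sigma ** 2)

    score = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    score[0][0] = log_emit(0, 0)  # the path must start in the first state

    for t in range(1, n_frames):
        for i in range(n_states):
            stay = score[t - 1][i] + log_stay
            move = score[t - 1][i - 1] + log_move if i > 0 else neg_inf
            if stay >= move:
                best, back[t][i] = stay, i
            else:
                best, back[t][i] = move, i - 1
            if best > neg_inf:
                score[t][i] = best + log_emit(i, t)

    # Force the path to end in the last state, then trace it backwards.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


if __name__ == "__main__":
    reference = [[0.0], [1.0], [2.0]]
    observed = [[0.0], [0.1], [1.0], [0.9], [2.0]]
    print(align_left_to_right(reference, observed))  # [0, 0, 1, 1, 2]
```

In this toy setting the emission means are fixed to the reference frames. In the method described above, by contrast, what each signal shares with the others is itself inferred jointly with the alignment.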

Original language: English
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5212-5216
Number of pages: 5
ISBN (Print): 9781479928927
DOI: https://doi.org/10.1109/ICASSP.2014.6854597
Publication status: Published - 2014
Externally published: Yes
Event: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence
Duration: 2014 May 4 to 2014 May 9

Other

Other: 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
City: Florence
Period: 14/5/4 to 14/5/9

Fingerprint

Decomposition
Hidden Markov models

Keywords

  • Audio-audio alignment
  • Nonparametric hierarchical Bayes

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Electrical and Electronic Engineering

Cite this

Maezawa, A., & Okuno, H. G. (2014). Audio part mixture alignment based on hierarchical nonparametric Bayesian model of musical audio sequence collection. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (pp. 5212-5216). [6854597] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2014.6854597

@inproceedings{2e4c857cd284477cbdfc5d3dff3d60e6,
title = "Audio part mixture alignment based on hierarchical nonparametric Bayesian model of musical audio sequence collection",
abstract = "This paper proposes 'audio part mixture alignment,' a method for temporally aligning multiple audio signals, each of which is a rendition of a non-disjoint subset of a common piece of music. The method decomposes each audio signal into shared components and components unique to each rendition. At the same time, it aligns each audio signal based on the shared component. Decomposition of audio signal is modeled using a hierarchical Dirichlet process (Hierarchical DP, HDP), and sequence alignment is modeled as a left-to-right hidden Markov model (HMM). Variational Bayesian inference is used to jointly infer the alignment and component decomposition. The proposed method is compared with a classic audio-to-audio alignment method, and it is found that the proposed method is more robust to the discrepancy of parts between two audio signals.",
keywords = "Audio-audio alignment, Nonparametric hierarchical Bayes",
author = "Akira Maezawa and Okuno, {Hiroshi G.}",
year = "2014",
doi = "10.1109/ICASSP.2014.6854597",
language = "English",
isbn = "9781479928927",
pages = "5212--5216",
booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY  - GEN
T1  - Audio part mixture alignment based on hierarchical nonparametric Bayesian model of musical audio sequence collection
AU  - Maezawa, Akira
AU  - Okuno, Hiroshi G.
PY  - 2014
Y1  - 2014
AB  - This paper proposes 'audio part mixture alignment,' a method for temporally aligning multiple audio signals, each of which is a rendition of a non-disjoint subset of a common piece of music. The method decomposes each audio signal into shared components and components unique to each rendition. At the same time, it aligns each audio signal based on the shared component. Decomposition of audio signal is modeled using a hierarchical Dirichlet process (Hierarchical DP, HDP), and sequence alignment is modeled as a left-to-right hidden Markov model (HMM). Variational Bayesian inference is used to jointly infer the alignment and component decomposition. The proposed method is compared with a classic audio-to-audio alignment method, and it is found that the proposed method is more robust to the discrepancy of parts between two audio signals.
KW  - Audio-audio alignment
KW  - Nonparametric hierarchical Bayes
UR  - http://www.scopus.com/inward/record.url?scp=84905240730&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84905240730&partnerID=8YFLogxK
U2  - 10.1109/ICASSP.2014.6854597
DO  - 10.1109/ICASSP.2014.6854597
M3  - Conference contribution
AN  - SCOPUS:84905240730
SN  - 9781479928927
SP  - 5212
EP  - 5216
BT  - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
PB  - Institute of Electrical and Electronics Engineers Inc.
ER  -