Abstract
This paper proposes a query-by-audio system that aims to detect the temporal locations at which a musical phrase given as a query is played in musical pieces. Here, a “phrase” is a short audio excerpt that is usually played by a single musical instrument and is not limited to the main melody (singing part). The main difficulty of this task is that the query is often buried in mixture signals consisting of various instruments. To address this, we propose a method that calculates the distance between a query and partial components of a musical piece. Specifically, gamma process nonnegative matrix factorization (GaP-NMF) decomposes the spectrogram of the query into an appropriate number of basis spectra and their activation patterns. Semi-supervised GaP-NMF then estimates the activation patterns of the learned basis spectra in the musical piece, under the assumption that the piece partially consists of those spectra. This enables distance calculation based on activation patterns. Experimental results showed that our method outperformed conventional matching methods.
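The pipeline described in the abstract (factorize the query, re-estimate activations of the learned bases inside the piece, then compare activation patterns) can be illustrated with a minimal sketch. This is not the authors' implementation: it substitutes plain KL-divergence NMF with multiplicative updates for GaP-NMF, so the number of bases is fixed by hand rather than inferred, and the sliding-window cosine distance is an assumption, since the abstract does not specify the distance measure. All function names (`kl_nmf`, `semi_supervised_nmf`, `spot`) and parameters are hypothetical.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero


def kl_nmf(V, n_bases, n_iter=200, seed=0):
    """Plain KL-NMF (stand-in for GaP-NMF): V ~= W @ H, all nonnegative."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_bases)) + EPS
    H = rng.random((n_bases, n_frames)) + EPS
    for _ in range(n_iter):
        WH = W @ H + EPS
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + EPS)
        WH = W @ H + EPS
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + EPS)
    # Normalize each basis spectrum to sum to 1; move the scale into H.
    scale = W.sum(axis=0) + EPS
    return W / scale, H * scale[:, None]


def semi_supervised_nmf(V, W_fixed, n_free, n_iter=200, seed=0):
    """Keep the query bases W_fixed frozen, learn n_free extra bases for the
    rest of the mixture, and return the activations of the frozen bases."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k = W_fixed.shape[1]
    W_free = rng.random((n_freq, n_free)) + EPS
    H = rng.random((k + n_free, n_frames)) + EPS
    for _ in range(n_iter):
        W = np.hstack([W_fixed, W_free])
        WH = W @ H + EPS
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + EPS)
        WH = W @ H + EPS
        # Only the free bases are updated; the query bases stay fixed.
        W_free *= ((V / WH) @ H[k:].T) / (H[k:].sum(axis=1)[None, :] + EPS)
    return H[:k]


def spot(H_query, H_piece):
    """Cosine distance (assumed measure) between the query activations and
    every window of the same length in the piece activations."""
    n_q = H_query.shape[1]
    q = H_query.ravel()
    q = q / (np.linalg.norm(q) + EPS)
    dists = []
    for t in range(H_piece.shape[1] - n_q + 1):
        seg = H_piece[:, t:t + n_q].ravel()
        dists.append(1.0 - float(q @ seg) / (np.linalg.norm(seg) + EPS))
    return np.array(dists)
```

Under these assumptions, one would compute `W_q, H_q = kl_nmf(query_spec, n_bases)`, then `H_p = semi_supervised_nmf(piece_spec, W_q, n_free)`, and take the frames where `spot(H_q, H_p)` is smallest as candidate locations of the phrase. The inputs `query_spec` and `piece_spec` stand for nonnegative magnitude spectrograms of shape (frequency bins, frames).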
Original language | English |
---|---|
Pages | 227-232 |
Number of pages | 6 |
Publication status | Published - 2014 Jan 1 |
Event | 15th International Society for Music Information Retrieval Conference, ISMIR 2014 - Taipei, Taiwan, Province of China. Duration: 2014 Oct 27 → 2014 Oct 31 |
Conference
Conference | 15th International Society for Music Information Retrieval Conference, ISMIR 2014 |
---|---|
Country | Taiwan, Province of China |
City | Taipei |
Period | 14/10/27 → 14/10/31 |
ASJC Scopus subject areas
- Music
- Information Systems
Cite this
Spotting a query phrase from polyphonic music audio signals based on semi-supervised nonnegative matrix factorization. / Masuda, Taro; Yoshii, Kazuyoshi; Goto, Masataka; Morishima, Shigeo.
2014. 227-232 Paper presented at 15th International Society for Music Information Retrieval Conference, ISMIR 2014, Taipei, Taiwan, Province of China.
Research output: Contribution to conference › Paper
TY - CONF
T1 - Spotting a query phrase from polyphonic music audio signals based on semi-supervised nonnegative matrix factorization
AU - Masuda, Taro
AU - Yoshii, Kazuyoshi
AU - Goto, Masataka
AU - Morishima, Shigeo
PY - 2014/1/1
Y1 - 2014/1/1
UR - http://www.scopus.com/inward/record.url?scp=84973290458&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84973290458&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:84973290458
SP - 227
EP - 232
ER -