Singer identification based on accompaniment sound reduction and reliable frame selection

Hiromasa Fujihara, Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

44 Citations (Scopus)

Abstract

This paper describes a method for automatic singer identification from polyphonic musical audio signals that include the sounds of various instruments. Because singing voices play an important role in musical pieces with a vocal part, identifying singer names is useful for music information retrieval systems. The main problem in automatically identifying singers is the negative influence of accompaniment sounds. To solve this problem, we developed two methods: accompaniment sound reduction and reliable frame selection. The former makes it possible to identify the singer of a singing voice after reducing accompaniment sounds: it first extracts the harmonic components of the predominant melody from sound mixtures and then resynthesizes the melody by using a sinusoidal model driven by those components. The latter then judges whether each frame of the obtained melody is reliable (i.e., little influenced by accompaniment sounds) by using two Gaussian mixture models, one for vocal and one for non-vocal frames, which enables singer identification using only the reliable vocal portions of musical pieces. Experimental results with forty popular-music songs by ten singers showed that our method reduced the influence of accompaniment sounds and achieved an accuracy of 95%, whereas a conventional method achieved 53%.
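The processing chain summarized above (predominant-melody extraction, harmonic resynthesis with a sinusoidal model, GMM-based vocal/non-vocal frame selection, and per-singer GMM scoring) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the authors' implementation: it substitutes librosa.pyin for the paper's predominant-F0 estimation, uses MFCCs as stand-in frame features, and every function name, mixture size, and threshold below is illustrative.

import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def reduce_accompaniment(y, sr, n_harmonics=20, n_fft=2048, hop=512):
    """Accompaniment sound reduction (sketch): track the predominant F0,
    sample the amplitudes of its harmonics from the spectrogram, and
    resynthesize only those harmonics with a bank of sinusoids."""
    f0, _, _ = librosa.pyin(y, fmin=80, fmax=800, sr=sr,
                            frame_length=n_fft, hop_length=hop)
    S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))
    freqs = librosa.fft_frequencies(sr=sr, n_fft=n_fft)
    window = np.hanning(n_fft)
    t = np.arange(n_fft) / sr
    out = np.zeros_like(y)
    for i, f in enumerate(f0[:S.shape[1]]):
        if np.isnan(f):
            continue                      # unvoiced frame: nothing to resynthesize
        frame = np.zeros(n_fft)
        for h in range(1, n_harmonics + 1):
            fh = h * f
            if fh >= sr / 2:
                break
            amp = S[np.argmin(np.abs(freqs - fh)), i]   # harmonic amplitude
            frame += amp * np.sin(2.0 * np.pi * fh * t)
        start = i * hop
        end = min(start + n_fft, len(out))
        out[start:end] += (window * frame)[:end - start]
    return out / (np.max(np.abs(out)) + 1e-9)

def frame_features(y, sr, hop=512):
    """Frame-level features of the resynthesized melody (MFCCs as a stand-in)."""
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=15, hop_length=hop).T

def select_reliable_frames(X, gmm_vocal, gmm_nonvocal, bias=0.0):
    """Reliable frame selection: keep frames whose vocal log-likelihood
    beats the non-vocal log-likelihood by more than a bias term."""
    keep = gmm_vocal.score_samples(X) > gmm_nonvocal.score_samples(X) + bias
    return X[keep]

def identify_singer(X_reliable, singer_gmms):
    """Score the reliable frames against each singer's GMM and return
    the name with the highest average log-likelihood."""
    return max(singer_gmms, key=lambda name: singer_gmms[name].score(X_reliable))

Training the two vocal/non-vocal GMMs on labelled frames and one GaussianMixture per singer on reliable frames of the resynthesized training songs would complete the pipeline; the 95% versus 53% accuracies quoted above refer to the authors' system and evaluation, not to this sketch.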

Original language: English
Title of host publication: ISMIR 2005 - 6th International Conference on Music Information Retrieval
Pages: 329-336
Number of pages: 8
ISBN (Print): 9780955117909
Publication status: Published - 2005
Externally published: Yes
Event: 6th International Conference on Music Information Retrieval, ISMIR 2005 - London
Duration: 2005 Sep 11 - 2005 Sep 15

Keywords

  • Artist identification
  • Melody extraction
  • Similarity-based MIR
  • Singer identification
  • Singing detection

ASJC Scopus subject areas

  • Music
  • Information Systems

Cite this

Fujihara, H., Kitahara, T., Goto, M., Komatani, K., Ogata, T., & Okuno, H. G. (2005). Singer identification based on accompaniment sound reduction and reliable frame selection. In ISMIR 2005 - 6th International Conference on Music Information Retrieval (pp. 329-336).
