A novel framework for recognizing phonemes of singing voice in polyphonic music

Hiromasa Fujihara, Masataka Goto, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

A novel method is described that can be used to recognize the phoneme of a singing voice (vocal) in polyphonic music. Though we focus on the voiced phoneme in this paper, this method is designed to concurrently recognize other elements of a singing voice such as fundamental frequency and singer. Thus, this method is considered to be a new framework for recognizing a singing voice in polyphonic music. Our method stochastically models a mixture of a singing voice and other instrumental sounds without segregating the singing voice. It can also estimate a reliable spectral envelope by estimating it from many harmonic structures with various fundamental frequencies (F0s). The results of phoneme recognition experiments with 10 popular-music songs by 6 singers showed that our method improves the recognition accuracy by 8.7 points and achieves a 20.0% decrease in error rate.
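
The two figures at the end of the abstract measure different things: the 8.7 points are an absolute gain in accuracy, while the 20.0% is a relative reduction in error rate. The sketch below shows how the two relate, assuming that error rate equals 100 minus accuracy and that both figures refer to the same baseline. The function name `implied_rates` is hypothetical, and the baseline values it returns are an arithmetic inference (roughly a 43.5% baseline error rate), not numbers reported in the abstract.

```python
def implied_rates(point_gain: float, relative_error_reduction: float) -> dict:
    """Infer baseline and improved rates (in percent) from two summary figures.

    Assumptions (not stated in the abstract):
      * error rate = 100 - accuracy
      * relative reduction = (baseline_error - new_error) / baseline_error
      * the accuracy gain in points equals the absolute drop in error rate
    """
    if not 0 < relative_error_reduction <= 1:
        raise ValueError("relative_error_reduction must be in (0, 1]")

    # An absolute drop of `point_gain` that is a fraction r of the baseline
    # error means baseline_error = point_gain / r.
    baseline_error = point_gain / relative_error_reduction
    if baseline_error > 100:
        raise ValueError("figures imply a baseline error rate above 100%")

    new_error = baseline_error - point_gain
    return {
        "baseline_error": baseline_error,
        "new_error": new_error,
        "baseline_accuracy": 100 - baseline_error,
        "new_accuracy": 100 - new_error,
    }


if __name__ == "__main__":
    # Figures quoted in the abstract: 8.7 points, 20.0% relative decrease.
    for name, value in implied_rates(8.7, 0.20).items():
        print(f"{name}: {value:.1f}%")
```

Read this way, the two figures are consistent with each other: an 8.7-point gain is a 20.0% relative reduction only if the baseline error rate was about 43.5%.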

Original language: English
Title of host publication: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Pages: 17-20
Number of pages: 4
DOIs: https://doi.org/10.1109/ASPAA.2009.5346497
Publication status: Published - 2009
Externally published: Yes
Event: 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2009 - New Paltz, NY
Duration: 2009 Oct 18 → 2009 Oct 21

Other

Other: 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2009
City: New Paltz, NY
Period: 09/10/18 → 09/10/21

Fingerprint

Acoustic waves
Experiments

Keywords

  • Mixture of experts
  • Phoneme recognition
  • Singing voice
  • Spectral modeling

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications

Cite this

Fujihara, H., Goto, M., & Okuno, H. G. (2009). A novel framework for recognizing phonemes of singing voice in polyphonic music. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (pp. 17-20). [5346497] https://doi.org/10.1109/ASPAA.2009.5346497

A novel framework for recognizing phonemes of singing voice in polyphonic music. / Fujihara, Hiromasa; Goto, Masataka; Okuno, Hiroshi G.

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2009. p. 17-20 5346497.

Fujihara, H, Goto, M & Okuno, HG 2009, A novel framework for recognizing phonemes of singing voice in polyphonic music. in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics., 5346497, pp. 17-20, 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2009, New Paltz, NY, 09/10/18. https://doi.org/10.1109/ASPAA.2009.5346497
Fujihara H, Goto M, Okuno HG. A novel framework for recognizing phonemes of singing voice in polyphonic music. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2009. p. 17-20. 5346497 https://doi.org/10.1109/ASPAA.2009.5346497
Fujihara, Hiromasa ; Goto, Masataka ; Okuno, Hiroshi G. / A novel framework for recognizing phonemes of singing voice in polyphonic music. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. 2009. pp. 17-20
@inproceedings{9923df442c6d4cc99c042f497f665ed7,
title = "A novel framework for recognizing phonemes of singing voice in polyphonic music",
abstract = "A novel method is described that can be used to recognize the phoneme of a singing voice (vocal) in polyphonic music. Though we focus on the voiced phoneme in this paper, this method is designed to concurrently recognize other elements of a singing voice such as fundamental frequency and singer. Thus, this method is considered to be a new framework for recognizing a singing voice in polyphonic music. Our method stochastically models a mixture of a singing voice and other instrumental sounds without segregating the singing voice. It can also estimate a reliable spectral envelope by estimating it from many harmonic structures with various fundamental frequencies (F0s). The results of phoneme recognition experiments with 10 popular-music songs by 6 singers showed that our method improves the recognition accuracy by 8.7 points and achieves a 20.0{\%} decrease in error rate.",
keywords = "Mixture of experts, Phoneme recognition, Singing voice, Spectral modeling",
author = "Hiromasa Fujihara and Masataka Goto and Okuno, {Hiroshi G.}",
year = "2009",
doi = "10.1109/ASPAA.2009.5346497",
language = "English",
isbn = "9781424436798",
pages = "17--20",
booktitle = "IEEE Workshop on Applications of Signal Processing to Audio and Acoustics",

}

TY - GEN

T1 - A novel framework for recognizing phonemes of singing voice in polyphonic music

AU - Fujihara, Hiromasa

AU - Goto, Masataka

AU - Okuno, Hiroshi G.

PY - 2009

Y1 - 2009

N2 - A novel method is described that can be used to recognize the phoneme of a singing voice (vocal) in polyphonic music. Though we focus on the voiced phoneme in this paper, this method is designed to concurrently recognize other elements of a singing voice such as fundamental frequency and singer. Thus, this method is considered to be a new framework for recognizing a singing voice in polyphonic music. Our method stochastically models a mixture of a singing voice and other instrumental sounds without segregating the singing voice. It can also estimate a reliable spectral envelope by estimating it from many harmonic structures with various fundamental frequencies (F0s). The results of phoneme recognition experiments with 10 popular-music songs by 6 singers showed that our method improves the recognition accuracy by 8.7 points and achieves a 20.0% decrease in error rate.

AB - A novel method is described that can be used to recognize the phoneme of a singing voice (vocal) in polyphonic music. Though we focus on the voiced phoneme in this paper, this method is designed to concurrently recognize other elements of a singing voice such as fundamental frequency and singer. Thus, this method is considered to be a new framework for recognizing a singing voice in polyphonic music. Our method stochastically models a mixture of a singing voice and other instrumental sounds without segregating the singing voice. It can also estimate a reliable spectral envelope by estimating it from many harmonic structures with various fundamental frequencies (F0s). The results of phoneme recognition experiments with 10 popular-music songs by 6 singers showed that our method improves the recognition accuracy by 8.7 points and achieves a 20.0% decrease in error rate.

KW - Mixture of experts

KW - Phoneme recognition

KW - Singing voice

KW - Spectral modeling

UR - http://www.scopus.com/inward/record.url?scp=77950181219&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77950181219&partnerID=8YFLogxK

U2 - 10.1109/ASPAA.2009.5346497

DO - 10.1109/ASPAA.2009.5346497

M3 - Conference contribution

SN - 9781424436798

SP - 17

EP - 20

BT - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

ER -