Discrimination of speech, musical instruments and singing voices using the temporal patterns of sinusoidal segments in audio signals

Toru Taniguchi, Akishige Adachi, Shigeki Okawa, Masaaki Honda, Katsuhiko Shirai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

We developed a method for discriminating speech, musical instruments and singing voices based on sinusoidal decomposition of audio signals. Although many studies have been conducted, few have addressed the problem of temporal overlap between sound categories. To cope with this problem, we used sinusoidal segments of variable length as the discrimination units, whereas most previous work has used fixed-length units. The discrimination is based on the temporal characteristics of the sinusoidal segments. We achieved an average discrimination rate of 71.56% when classifying sinusoidal segments in non-mixed audio data. At the time-segment level, accuracies of 87.9% on non-mixed-category audio data and 66.4% on two-category mixed audio data were achieved. A comparison of the proposed method with an MFCC-based method demonstrated the effectiveness of the temporal features and the importance of using both spectral and temporal characteristics.
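
The following is a minimal Python sketch, not the authors' implementation, of the general idea described in the abstract: peaks are picked in each short-time spectral frame and linked across frames into variable-length sinusoidal segments, from which simple temporal descriptors (duration, frequency slope, amplitude modulation) are computed. All thresholds, the feature set, and the test signal are illustrative assumptions, and the classifier stage used in the paper is omitted.

# Illustrative sketch (not the authors' implementation): sinusoidal decomposition by
# per-frame peak picking, peak tracking into variable-length segments, and simple
# temporal features per segment. Thresholds and features are assumed values.
import numpy as np
from scipy.signal import stft, find_peaks

def extract_sinusoidal_segments(x, fs, n_fft=1024, hop=256,
                                max_freq_jump_hz=50.0, min_len_frames=5):
    """Link spectral peaks across frames into variable-length sinusoidal segments."""
    f, t, Z = stft(x, fs, nperseg=n_fft, noverlap=n_fft - hop)
    mag = np.abs(Z)
    active, finished = [], []
    for j in range(mag.shape[1]):
        frame = mag[:, j]
        peaks, _ = find_peaks(frame, height=0.1 * frame.max())  # arbitrary 10% threshold
        peak_freqs, peak_amps = f[peaks], frame[peaks]
        matched, next_active = set(), []
        for seg in active:
            if len(peak_freqs) > 0:
                d = np.abs(peak_freqs - seg["freq"][-1])
                k = int(np.argmin(d))
                if d[k] < max_freq_jump_hz and k not in matched:
                    seg["freq"].append(peak_freqs[k])
                    seg["amp"].append(peak_amps[k])
                    matched.add(k)
                    next_active.append(seg)
                    continue
            # no continuation found: the segment ends at the previous frame
            if len(seg["freq"]) >= min_len_frames:
                finished.append(seg)
        for k in range(len(peak_freqs)):  # unmatched peaks start new segments
            if k not in matched:
                next_active.append({"start": j,
                                    "freq": [peak_freqs[k]],
                                    "amp": [peak_amps[k]]})
        active = next_active
    finished += [s for s in active if len(s["freq"]) >= min_len_frames]
    return finished

def temporal_features(seg, hop, fs):
    """Per-segment temporal descriptors: duration, frequency slope, amplitude modulation."""
    fr, am = np.asarray(seg["freq"]), np.asarray(seg["amp"])
    duration = len(fr) * hop / fs
    freq_slope = np.polyfit(np.arange(len(fr)), fr, 1)[0]   # Hz per frame
    amp_mod = am.std() / (am.mean() + 1e-12)                # normalized amplitude variation
    return np.array([duration, freq_slope, amp_mod])

if __name__ == "__main__":
    fs = 16000
    tt = np.arange(fs) / fs
    x = np.sin(2 * np.pi * (440 + 30 * tt) * tt)   # a slowly gliding test tone
    segments = extract_sinusoidal_segments(x, fs)
    feats = np.array([temporal_features(s, 256, fs) for s in segments])
    print(len(segments), "segments; feature matrix shape", feats.shape)
    # In the paper, per-segment features of this kind feed a statistical classifier
    # (one model per category); that stage is not reproduced here.

In this view, the variable segment length follows directly from the peak-tracking step: a classification unit spans exactly the duration of one sinusoidal component rather than a fixed-length frame, which is what allows temporally overlapping categories to be handled.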

Original language: English
Title of host publication: 9th European Conference on Speech Communication and Technology
Pages: 589-592
Number of pages: 4
Publication status: Published - 2005
Event: 9th European Conference on Speech Communication and Technology, Lisbon
Duration: 2005 Sep 4 - 2005 Sep 8

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Taniguchi, T., Adachi, A., Okawa, S., Honda, M., & Shirai, K. (2005). Discrimination of speech, musical instruments and singing voices using the temporal patterns of sinusoidal segments in audio signals. In 9th European Conference on Speech Communication and Technology (pp. 589-592).
