Acoustic features for estimation of perceptional similarity

Yoshihiro Adachi, Shinichi Kawamoto, Shigeo Morishima, Satoshi Nakamura

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper examines acoustic features for estimating the perceptional similarity between speech samples. First, we extract acoustic features, including speaker personality, from the speech of 36 persons. Second, we calculate the distance between the extracted features using a Gaussian Mixture Model (GMM) or Dynamic Time Warping (DTW), and sort the speech samples by this physical similarity. Separately, we obtain a second permutation based on perceptional similarity, ordered according to the subjects' judgments. We evaluate each physical feature by Spearman's rank correlation coefficient between the two permutations. The results show that DTW distance with high STRAIGHT Cepstrum is the best of the examined features for estimating perceptional similarity.
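The evaluation pipeline outlined in the abstract (pairwise distance, ordering by physical similarity, rank correlation against a perceptional ordering) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the 2-D toy frames stand in for real acoustic features such as STRAIGHT cepstrum, the Euclidean local cost and the three-way DTW step pattern are assumptions, the perceptual ratings are made up, and all names (`dtw_distance`, `ranks`, `spearman_rho`) are hypothetical. The GMM-based distance is not covered.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature frames.

    Assumes Euclidean local cost and the basic step pattern
    (insertion, deletion, match); the paper may use a different setup.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def ranks(values):
    """Rank of each value (1 = smallest). Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def spearman_rho(x, y):
    """Spearman's rank correlation coefficient for tie-free data."""
    n = len(x)
    d2 = sum((p - q) ** 2 for p, q in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Toy data (hypothetical): one reference utterance and four candidates.
reference = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
candidates = [
    [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [2.0, 2.0]],  # time-stretched copy
    [[0.5, 0.5], [1.5, 1.5], [2.5, 2.5]],
    [[3.0, 3.0], [4.0, 4.0], [5.0, 5.0]],
    [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]],
]

# Physical dissimilarity of each candidate to the reference.
physical = [dtw_distance(reference, c) for c in candidates]

# Made-up perceptual dissimilarity ratings for the same four candidates.
perceptual = [1.0, 3.0, 4.0, 2.0]

# Agreement between the physical and the perceptual orderings.
rho = spearman_rho(physical, perceptual)
```

A value of `rho` near 1 would indicate that sorting by the physical distance reproduces the perceptual ordering well; comparing `rho` across candidate features is the kind of evaluation the abstract describes. Real data with tied ranks would need a tie-aware rank correlation instead of the closed-form formula used here.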

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 306-314
Number of pages: 9
Volume: 4810 LNCS
Publication status: Published - 2007
Externally published: Yes
Event: 8th Pacific-Rim Conference on Multimedia, PCM 2007 - Hong Kong
Duration: 2007 Dec 11 to 2007 Dec 14

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4810 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th Pacific-Rim Conference on Multimedia, PCM 2007
City: Hong Kong
Period: 07/12/11 to 07/12/14

Fingerprint

Acoustics
Dynamic Time Warping
Permutation
Cepstrum
Nonparametric Statistics
Spearman's coefficient
Personality
Gaussian Mixture Model
Correlation coefficient
Sort
Person
Calculate
Similarity
Evaluate
Speech

Keywords

  • Acoustic features
  • Perceptional similarity
  • Physical similarity
  • Spearman's rank correlation coefficient

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Adachi, Y., Kawamoto, S., Morishima, S., & Nakamura, S. (2007). Acoustic features for estimation of perceptional similarity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4810 LNCS, pp. 306-314).

@inproceedings{8cc1ba3f070e43de8b6e3dd7856b9d1b,
title = "Acoustic features for estimation of perceptional similarity",
abstract = "This paper describes an examination of acoustic features for the estimation of perceptional similarity between speeches. We firstly extract some acoustic features including personality from speeches of 36 persons. Secondly, we calculate each distance between extracted features using Gaussian Mixture Model (GMM) or Dynamic Time Warping (DTW), and then we sort speeches based on the physical similarity. On the other hand, there is the permutation based on the perceptional similarity which is sorted according to the subject. We evaluate the physical features by the Spearman's rank correlation coefficient with two permutations. Consequently, the results show that DTW distance with high STRAIGHT Cepstrum is an optimum feature for estimation of perceptional similarity.",
keywords = "Acoustic features, Perceptional similarity, Physical similarity, Spearman's rank correlation coefficient",
author = "Yoshihiro Adachi and Shinichi Kawamoto and Shigeo Morishima and Satoshi Nakamura",
year = "2007",
language = "English",
isbn = "9783540772545",
volume = "4810 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "306--314",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Acoustic features for estimation of perceptional similarity

AU - Adachi, Yoshihiro

AU - Kawamoto, Shinichi

AU - Morishima, Shigeo

AU - Nakamura, Satoshi

PY - 2007

Y1 - 2007

AB - This paper describes an examination of acoustic features for the estimation of perceptional similarity between speeches. We firstly extract some acoustic features including personality from speeches of 36 persons. Secondly, we calculate each distance between extracted features using Gaussian Mixture Model (GMM) or Dynamic Time Warping (DTW), and then we sort speeches based on the physical similarity. On the other hand, there is the permutation based on the perceptional similarity which is sorted according to the subject. We evaluate the physical features by the Spearman's rank correlation coefficient with two permutations. Consequently, the results show that DTW distance with high STRAIGHT Cepstrum is an optimum feature for estimation of perceptional similarity.

KW - Acoustic features

KW - Perceptional similarity

KW - Physical similarity

KW - Spearman's rank correlation coefficient

UR - http://www.scopus.com/inward/record.url?scp=38349082361&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=38349082361&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:38349082361

SN - 9783540772545

VL - 4810 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 306

EP - 314

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -