Multi-level speech emotion recognition based on Fisher criterion and SVM

Li Jiang Chen, Xia Mao, Mitsuru Ishizuka

Research output: Contribution to journal (Article)

Abstract

To address speaker-independent emotion recognition, a multi-level speech emotion recognition system is proposed that classifies six speech emotions (sadness, anger, surprise, fear, happiness, and disgust) from coarse to fine. The key idea is that the emotions separated at each level are closely related to the emotional features of speech. At each level, appropriate features are selected from 288 candidate features by Fisher ratio, which also serves as an input parameter for training the support vector machine (SVM). Using the Beihang emotional speech database and the Berlin emotional speech database, four comparative experiments are designed, with principal component analysis (PCA) as the alternative dimension-reduction method and an artificial neural network (ANN) as the alternative classifier: Fisher+SVM, PCA+SVM, Fisher+ANN, and PCA+ANN. The experimental results show that the Fisher criterion outperforms PCA for dimension reduction and that SVM generalizes better than ANN for speaker-independent speech emotion recognition. The similar results on the two databases suggest good cross-cultural adaptability.
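
The abstract does not give the exact form of the Fisher ratio it uses. The sketch below assumes a common definition (between-class scatter of the class means divided by pooled within-class scatter, computed per feature) and ranks features by it. The function names `fisher_ratios` and `select_top_k` are hypothetical, and the standard library is used only to keep the example self-contained; this is an illustration of the general technique, not the authors' implementation.

```python
from statistics import fmean


def fisher_ratios(samples, labels):
    """Return one Fisher ratio per feature column.

    Assumed definition (not taken from the paper): between-class scatter of
    the per-class means divided by the pooled within-class scatter.
    """
    classes = sorted(set(labels))
    ratios = []
    for j in range(len(samples[0])):
        column = [row[j] for row in samples]
        overall_mean = fmean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, y in zip(column, labels) if y == c]
            class_mean = fmean(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        if within > 0:
            ratios.append(between / within)
        else:
            # Zero within-class scatter: perfectly separating or constant feature.
            ratios.append(float("inf") if between > 0 else 0.0)
    return ratios


def select_top_k(samples, labels, k):
    """Return the indices of the k features with the largest Fisher ratio."""
    ratios = fisher_ratios(samples, labels)
    ranked = sorted(range(len(ratios)), key=lambda j: ratios[j], reverse=True)
    return sorted(ranked[:k])
```

In a multi-level setup such as the one described, a selection like this would presumably be run separately at each level, on the emotion groups that level has to separate, and the retained feature columns passed to that level's classifier.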

Original language: English
Pages (from-to): 604-609
Number of pages: 6
Journal: Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence
Volume: 25
Issue number: 4
Publication status: Published - 2012 Aug
Externally published: Yes

Fingerprint

Speech recognition
Principal component analysis
Support vector machines
Neural networks
Experiments

Keywords

  • Fisher criterion
  • Speaker independent
  • Speech emotion recognition
  • Support vector machine

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Multi-level speech emotion recognition based on Fisher criterion and SVM. / Chen, Li Jiang; Mao, Xia; Ishizuka, Mitsuru.

In: Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, Vol. 25, No. 4, 08.2012, p. 604-609.

@article{c4a60395918f4fd7a6c325e07a60603f,
title = "Multi-level speech emotion recognition based on Fisher criterion and SVM",
abstract = "To address speaker-independent emotion recognition, a multi-level speech emotion recognition system is proposed that classifies six speech emotions (sadness, anger, surprise, fear, happiness, and disgust) from coarse to fine. The key idea is that the emotions separated at each level are closely related to the emotional features of speech. At each level, appropriate features are selected from 288 candidate features by Fisher ratio, which also serves as an input parameter for training the support vector machine (SVM). Using the Beihang emotional speech database and the Berlin emotional speech database, four comparative experiments are designed, with principal component analysis (PCA) as the alternative dimension-reduction method and an artificial neural network (ANN) as the alternative classifier: Fisher+SVM, PCA+SVM, Fisher+ANN, and PCA+ANN. The experimental results show that the Fisher criterion outperforms PCA for dimension reduction and that SVM generalizes better than ANN for speaker-independent speech emotion recognition. The similar results on the two databases suggest good cross-cultural adaptability.",
keywords = "Fisher criterion, Speaker independent, Speech emotion recognition, Support vector machine",
author = "Chen, Li Jiang and Mao, Xia and Ishizuka, Mitsuru",
year = "2012",
month = "8",
language = "English",
volume = "25",
pages = "604--609",
journal = "Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence",
issn = "1003-6059",
publisher = "Journal of Pattern Recognition and Artificial Intelligence",
number = "4",
}

TY - JOUR

T1 - Multi-level speech emotion recognition based on Fisher criterion and SVM

AU - Chen, Li Jiang

AU - Mao, Xia

AU - Ishizuka, Mitsuru

PY - 2012/8

Y1 - 2012/8

AB - To address speaker-independent emotion recognition, a multi-level speech emotion recognition system is proposed that classifies six speech emotions (sadness, anger, surprise, fear, happiness, and disgust) from coarse to fine. The key idea is that the emotions separated at each level are closely related to the emotional features of speech. At each level, appropriate features are selected from 288 candidate features by Fisher ratio, which also serves as an input parameter for training the support vector machine (SVM). Using the Beihang emotional speech database and the Berlin emotional speech database, four comparative experiments are designed, with principal component analysis (PCA) as the alternative dimension-reduction method and an artificial neural network (ANN) as the alternative classifier: Fisher+SVM, PCA+SVM, Fisher+ANN, and PCA+ANN. The experimental results show that the Fisher criterion outperforms PCA for dimension reduction and that SVM generalizes better than ANN for speaker-independent speech emotion recognition. The similar results on the two databases suggest good cross-cultural adaptability.

KW - Fisher criterion

KW - Speaker independent

KW - Speech emotion recognition

KW - Support vector machine

UR - http://www.scopus.com/inward/record.url?scp=84867418522&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867418522&partnerID=8YFLogxK

M3 - Article

VL - 25

SP - 604

EP - 609

JO - Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence

JF - Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence

SN - 1003-6059

IS - 4

ER -