Large Scale Environmental Sound Classification Based on Efficient Feature Extraction

Xiaoyan Wang, Hao Zhou, Zhi Liu, Yu Gu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In recent years, many studies have sought to analyze everyday auditory scenes by mining non-speech sounds. Conventional audio recognition schemes, such as those used in speech recognition, draw a clear boundary between the feature extraction and recognition stages. However, this separation leaves the two stages with inconsistent goals. The recognition stage aims to model the global data distribution, focusing on the 'relationship' between signal samples. Such considerations can hardly be built into a feature extraction process that is centered on local structure, so the prominent 'relationship' information is destroyed. In this paper, we propose a unified acoustic recognition framework that takes primitive features as input, preserving discriminant information, and adopts a classification scheme suited to that input. We formulate each sound as a subspace representation and initially adopt the Grassmannian distance to classify the subspace-indexed non-speech sounds. To validate the proposed framework, we conducted experiments on the RWCP Sound Scene Database. The experimental results show that the proposed framework achieves good recognition performance with high efficiency.
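
The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea it names: represent each sound as a subspace and compare subspaces with a Grassmannian distance. It assumes the subspace is spanned by the leading left singular vectors of a magnitude spectrogram, that the distance is the geodesic distance computed from principal angles, and that classification is nearest-neighbour. All function names, parameters, and the synthetic test tones are hypothetical and are not taken from the paper.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram (frequency bins x frames), Hann-windowed FFT."""
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] * window for s in starts], axis=1)
    return np.abs(np.fft.rfft(frames, axis=0))

def subspace_basis(spec, k=2):
    """Orthonormal basis (bins x k) from the k leading left singular vectors."""
    u, _, _ = np.linalg.svd(spec, full_matrices=False)
    return u[:, :k]

def grassmann_distance(a, b):
    """Geodesic distance: 2-norm of the principal angles between two subspaces."""
    # Singular values of a.T @ b are the cosines of the principal angles.
    cosines = np.linalg.svd(a.T @ b, compute_uv=False)
    angles = np.arccos(np.clip(cosines, 0.0, 1.0))
    return float(np.linalg.norm(angles))

def classify(query, references):
    """Return the label of the reference subspace closest to the query."""
    return min(references, key=lambda item: grassmann_distance(query, item[1]))[0]

def synth(freqs, seed, sr=8000, dur=0.5):
    """Assumed toy signal: decaying sinusoids plus a little white noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(sr * dur)) / sr
    decays = np.linspace(2.0, 12.0, len(freqs))  # one decay rate per component
    sig = sum(np.sin(2 * np.pi * f * t) * np.exp(-d * t)
              for f, d in zip(freqs, decays))
    return sig + 0.01 * rng.standard_normal(t.size)

# Two reference classes and one query drawn from the "low" class.
references = [
    ("low", subspace_basis(spectrogram(synth([300, 700], seed=0)))),
    ("high", subspace_basis(spectrogram(synth([1500, 2500], seed=1)))),
]
query = subspace_basis(spectrogram(synth([300, 700], seed=2)))
print(classify(query, references))
```

Working on the raw spectrogram rather than on hand-crafted local features is meant to mirror the abstract's point about primitive feature input; the choice of `k`, the window settings, and the nearest-neighbour rule are illustrative defaults only.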

Original language: English
Title of host publication: Proceedings - 45th International Conference on Parallel Processing Workshops, ICPPW 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 421-425
Number of pages: 5
Volume: 2016-September
ISBN (Electronic): 9781509028252
DOIs: https://doi.org/10.1109/ICPPW.2016.64
Publication status: Published - 2016 Sep 23
Event: 45th International Conference on Parallel Processing Workshops, ICPPW 2016 - Philadelphia, United States
Duration: 2016 Aug 16 to 2016 Aug 19

Other

Other: 45th International Conference on Parallel Processing Workshops, ICPPW 2016
Country: United States
City: Philadelphia
Period: 2016 Aug 16 to 2016 Aug 19

Fingerprint

Feature Extraction
Feature extraction
Acoustic waves
Subspace
Feature Recognition
Grassmannian
Local Structure
Data Distribution
Speech Recognition
Speech recognition
Discriminant
Inconsistency
High Efficiency
Mining
Acoustics
Classify
Scenarios
Sound
Experimental Results
Experiment

Keywords

  • Grassmann manifold
  • non-speech sound recognition
  • spectrogram
  • subspace learning

ASJC Scopus subject areas

  • Software
  • Mathematics(all)
  • Hardware and Architecture

Cite this

Wang, X., Zhou, H., Liu, Z., & Gu, Y. (2016). Large Scale Environmental Sound Classification Based on Efficient Feature Extraction. In Proceedings - 45th International Conference on Parallel Processing Workshops, ICPPW 2016 (Vol. 2016-September, pp. 421-425). [7576494] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICPPW.2016.64

Large Scale Environmental Sound Classification Based on Efficient Feature Extraction. / Wang, Xiaoyan; Zhou, Hao; Liu, Zhi; Gu, Yu.

Proceedings - 45th International Conference on Parallel Processing Workshops, ICPPW 2016. Vol. 2016-September Institute of Electrical and Electronics Engineers Inc., 2016. p. 421-425 7576494.

Wang, X, Zhou, H, Liu, Z & Gu, Y 2016, Large Scale Environmental Sound Classification Based on Efficient Feature Extraction. in Proceedings - 45th International Conference on Parallel Processing Workshops, ICPPW 2016. vol. 2016-September, 7576494, Institute of Electrical and Electronics Engineers Inc., pp. 421-425, 45th International Conference on Parallel Processing Workshops, ICPPW 2016, Philadelphia, United States, 16/8/16. https://doi.org/10.1109/ICPPW.2016.64
Wang X, Zhou H, Liu Z, Gu Y. Large Scale Environmental Sound Classification Based on Efficient Feature Extraction. In Proceedings - 45th International Conference on Parallel Processing Workshops, ICPPW 2016. Vol. 2016-September. Institute of Electrical and Electronics Engineers Inc. 2016. p. 421-425. 7576494 https://doi.org/10.1109/ICPPW.2016.64
Wang, Xiaoyan ; Zhou, Hao ; Liu, Zhi ; Gu, Yu. / Large Scale Environmental Sound Classification Based on Efficient Feature Extraction. Proceedings - 45th International Conference on Parallel Processing Workshops, ICPPW 2016. Vol. 2016-September Institute of Electrical and Electronics Engineers Inc., 2016. pp. 421-425
@inproceedings{8d3b2248d2a945fa8dc20f164cf1d236,
title = "Large Scale Environmental Sound Classification Based on Efficient Feature Extraction",
abstract = "In recent years, many studies have sought to analyze everyday auditory scenes by mining non-speech sounds. Conventional audio recognition schemes, such as those used in speech recognition, draw a clear boundary between the feature extraction and recognition stages. However, this separation leaves the two stages with inconsistent goals. The recognition stage aims to model the global data distribution, focusing on the 'relationship' between signal samples. Such considerations can hardly be built into a feature extraction process that is centered on local structure, so the prominent 'relationship' information is destroyed. In this paper, we propose a unified acoustic recognition framework that takes primitive features as input, preserving discriminant information, and adopts a classification scheme suited to that input. We formulate each sound as a subspace representation and initially adopt the Grassmannian distance to classify the subspace-indexed non-speech sounds. To validate the proposed framework, we conducted experiments on the RWCP Sound Scene Database. The experimental results show that the proposed framework achieves good recognition performance with high efficiency.",
keywords = "Grassmann manifold, non-speech sound recognition, spectrogram, subspace learning",
author = "Xiaoyan Wang and Hao Zhou and Zhi Liu and Yu Gu",
year = "2016",
month = "9",
day = "23",
doi = "10.1109/ICPPW.2016.64",
language = "English",
volume = "2016-September",
pages = "421--425",
booktitle = "Proceedings - 45th International Conference on Parallel Processing Workshops, ICPPW 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - Large Scale Environmental Sound Classification Based on Efficient Feature Extraction

AU - Wang, Xiaoyan

AU - Zhou, Hao

AU - Liu, Zhi

AU - Gu, Yu

PY - 2016/9/23

Y1 - 2016/9/23

AB - In recent years, many studies have sought to analyze everyday auditory scenes by mining non-speech sounds. Conventional audio recognition schemes, such as those used in speech recognition, draw a clear boundary between the feature extraction and recognition stages. However, this separation leaves the two stages with inconsistent goals. The recognition stage aims to model the global data distribution, focusing on the 'relationship' between signal samples. Such considerations can hardly be built into a feature extraction process that is centered on local structure, so the prominent 'relationship' information is destroyed. In this paper, we propose a unified acoustic recognition framework that takes primitive features as input, preserving discriminant information, and adopts a classification scheme suited to that input. We formulate each sound as a subspace representation and initially adopt the Grassmannian distance to classify the subspace-indexed non-speech sounds. To validate the proposed framework, we conducted experiments on the RWCP Sound Scene Database. The experimental results show that the proposed framework achieves good recognition performance with high efficiency.

KW - Grassmann manifold

KW - non-speech sound recognition

KW - spectrogram

KW - subspace learning

UR - http://www.scopus.com/inward/record.url?scp=84990922170&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84990922170&partnerID=8YFLogxK

U2 - 10.1109/ICPPW.2016.64

DO - 10.1109/ICPPW.2016.64

M3 - Conference contribution

AN - SCOPUS:84990922170

VL - 2016-September

SP - 421

EP - 425

BT - Proceedings - 45th International Conference on Parallel Processing Workshops, ICPPW 2016

PB - Institute of Electrical and Electronics Engineers Inc.

ER -