Discriminative approach to dynamic variance adaptation for noisy speech recognition

Marc Delcroix, Shinji Watanabe, Tomohiro Nakatani, Atsushi Nakamura

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

The performance of automatic speech recognition suffers from severe degradation in the presence of noise or reverberation. One conventional approach for handling such acoustic distortions is to use a speech enhancement technique prior to recognition. However, most speech enhancement techniques introduce artifacts that create a mismatch between the enhanced speech features and the acoustic model used for recognition, thereby limiting the improvement in recognition performance. Recently, there has been increased interest in methods capable of compensating for such a mismatch by accounting for the feature variance during decoding. In this paper, we propose to estimate the feature variance using an adaptation technique based on a discriminative criterion. In an experiment using the Aurora2 database, the proposed method achieved a significant reduction in digit error rate compared with a spectral subtraction pre-processor, and using a discriminative criterion for adaptation provided a further improvement over maximum likelihood estimation.
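
As a rough illustration of what "accounting for the feature variance during decoding" can mean, the sketch below adds a per-dimension feature variance to the variance of a diagonal-covariance Gaussian before scoring an enhanced feature vector. This is a generic variance-compensation sketch and not the authors' estimation method: the function names and the toy numbers are assumptions, and the feature variance is simply passed in as an input rather than estimated with a discriminative criterion.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gauss_with_feature_variance(x, mean, model_var, feature_var):
    """Score x after adding the feature variance to the model variance.

    feature_var is a hypothetical per-dimension uncertainty of the
    enhanced feature; how it is obtained is outside this sketch.
    """
    compensated = [mv + fv for mv, fv in zip(model_var, feature_var)]
    return log_gauss_diag(x, mean, compensated)


if __name__ == "__main__":
    # Toy one-dimensional example: a feature far from the model mean.
    x, mean, model_var = [3.0], [0.0], [1.0]
    print(log_gauss_diag(x, mean, model_var))
    print(log_gauss_with_feature_variance(x, mean, model_var, [4.0]))
```

With a zero feature variance the score reduces to the ordinary Gaussian log-likelihood; a larger feature variance flattens the penalty for features that lie far from the model mean.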

Original language: English
Title of host publication: 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11
Pages: 7-12
Number of pages: 6
DOI: https://doi.org/10.1109/HSCMA.2011.5942414
Publication status: Published - 2011
Externally published: Yes
Event: 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11 - Edinburgh, United Kingdom
Duration: 2011 May 30 to 2011 Jun 1

Other

Other: 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11
Country: United Kingdom
City: Edinburgh
Period: 11/5/30 to 11/6/1

Fingerprint

Speech enhancement
Speech recognition
Acoustic distortion
Reverberation
Maximum likelihood estimation
Mismatch
Acoustic noise
Acoustics
Decoding
Degradation
Performance
Artifact
Experiments

Keywords

  • MMI
  • Model Adaptation
  • Noise reduction
  • Robust ASR
  • Variance Compensation

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
  • Communication

Cite this

Delcroix, M., Watanabe, S., Nakatani, T., & Nakamura, A. (2011). Discriminative approach to dynamic variance adaptation for noisy speech recognition. In 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11 (pp. 7-12). [5942414] https://doi.org/10.1109/HSCMA.2011.5942414

@inproceedings{005e71377c6e40a194a95cb1578c7226,
title = "Discriminative approach to dynamic variance adaptation for noisy speech recognition",
abstract = "The performance of automatic speech recognition suffers from severe degradation in the presence of noise or reverberation. One conventional approach for handling such acoustic distortions is to use a speech enhancement technique prior to recognition. However, most speech enhancement techniques introduce artifacts that create a mismatch between the enhanced speech features and the acoustic model used for recognition, therefore limiting the improvement in recognition performance. Recently, there has been increased interest in methods capable of compensating for such a mismatch by accounting for the feature variance during decoding. In this paper, we propose to estimate the feature variance using an adaptation technique based on a discriminative criterion. In an experiment using the Aurora2 database, the proposed method could achieve significant digit error rate reduction compared with a spectral subtraction pre-processor, and using a discriminative criterion for adaptation provided further improvement compared with maximum likelihood estimation.",
keywords = "MMI, Model Adaptation, Noise reduction, Robust ASR, Variance Compensation",
author = "Marc Delcroix and Shinji Watanabe and Tomohiro Nakatani and Atsushi Nakamura",
year = "2011",
doi = "10.1109/HSCMA.2011.5942414",
language = "English",
isbn = "9781457709999",
pages = "7--12",
booktitle = "2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11",
}

TY - GEN

T1 - Discriminative approach to dynamic variance adaptation for noisy speech recognition

AU - Delcroix, Marc

AU - Watanabe, Shinji

AU - Nakatani, Tomohiro

AU - Nakamura, Atsushi

PY - 2011

Y1 - 2011

AB - The performance of automatic speech recognition suffers from severe degradation in the presence of noise or reverberation. One conventional approach for handling such acoustic distortions is to use a speech enhancement technique prior to recognition. However, most speech enhancement techniques introduce artifacts that create a mismatch between the enhanced speech features and the acoustic model used for recognition, therefore limiting the improvement in recognition performance. Recently, there has been increased interest in methods capable of compensating for such a mismatch by accounting for the feature variance during decoding. In this paper, we propose to estimate the feature variance using an adaptation technique based on a discriminative criterion. In an experiment using the Aurora2 database, the proposed method could achieve significant digit error rate reduction compared with a spectral subtraction pre-processor, and using a discriminative criterion for adaptation provided further improvement compared with maximum likelihood estimation.

KW - MMI

KW - Model Adaptation

KW - Noise reduction

KW - Robust ASR

KW - Variance Compensation

UR - http://www.scopus.com/inward/record.url?scp=79961165363&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79961165363&partnerID=8YFLogxK

U2 - 10.1109/HSCMA.2011.5942414

DO - 10.1109/HSCMA.2011.5942414

M3 - Conference contribution

SN - 9781457709999

SP - 7

EP - 12

BT - 2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays, HSCMA'11

ER -