Spatial filter calibration based on minimization of modified LSD

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    A new sound source separation method has been developed that is robust against individual variability in microphones and acoustic lines. The area containing the target sound source is enhanced by a spatial filter built with time-frequency masking. However, such spatial filters are likely to be distorted by individual variability in microphone characteristics and acoustic lines. To solve this problem, the shapes of these spatial filters were calibrated using a modified log-spectral distance (MLSD) minimization criterion, which uses utterances made by each individual (i.e., a sound source) at the desired positions. The effectiveness of this calibration was verified in speech recognition experiments: MLSD-based calibration produced fewer word errors than either no calibration or calibration using other criteria.
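
    The abstract does not define the modification that turns the log-spectral distance into MLSD, so the following is only a minimal sketch of the conventional, unmodified LSD and of selection by minimum distance. The function names, the `eps` floor, and the idea of choosing among precomputed candidate outputs are assumptions made for illustration, not the authors' procedure.

```python
import math


def log_spectral_distance(ref_power, est_power, eps=1e-12):
    """Conventional LSD (in dB) between two power spectra of one frame.

    Note: this is the standard definition, not the paper's modified LSD.
    `eps` is an assumed floor that avoids log(0) on empty bins.
    """
    if len(ref_power) != len(est_power):
        raise ValueError("spectra must have the same number of bins")
    squared_db = [
        (10.0 * math.log10((r + eps) / (e + eps))) ** 2
        for r, e in zip(ref_power, est_power)
    ]
    # Root of the mean squared log-ratio over frequency bins.
    return math.sqrt(sum(squared_db) / len(squared_db))


def mean_lsd(ref_frames, est_frames):
    """Average the per-frame LSD over all frames of an utterance."""
    distances = [
        log_spectral_distance(r, e) for r, e in zip(ref_frames, est_frames)
    ]
    return sum(distances) / len(distances)


def select_min_lsd(ref_frames, candidates):
    """Return the index of the candidate output closest to the reference.

    Hypothetical stand-in for a minimization criterion: each candidate is
    the output of one filter setting, given as a list of power spectra.
    """
    scores = [mean_lsd(ref_frames, c) for c in candidates]
    return min(range(len(scores)), key=scores.__getitem__)
```

    A tenfold power difference in every bin gives a distance of 10 dB, which is a convenient sanity check when adapting the sketch to real spectra.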

    Original language: English
    Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
    Pages: 1761-1764
    Number of pages: 4
    Publication status: Published - 2011
    Event: 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy
    Duration: 2011 Aug 27 → 2011 Aug 31

    Other

    Other: 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011
    Country: Italy
    City: Florence
    Period: 11/8/27 → 11/8/31

    Fingerprint

    Calibration
    Filter
    Telephone lines
    Acoustic waves
    Microphones
    Acoustics
    Source separation
    Line
    Masking
    Speech Recognition
    Likelihood
    Target
    Experiment
    Sound
    Spectrality

    Keywords

    • Modified LSD
    • Sound source separation
    • Spatial filter calibration
    • Time-frequency masking

    ASJC Scopus subject areas

    • Language and Linguistics
    • Human-Computer Interaction
    • Signal Processing
    • Software
    • Modelling and Simulation

    Cite this

    Tanaka, N., Ogawa, T., & Kobayashi, T. (2011). Spatial filter calibration based on minimization of modified LSD. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 1761-1764)

    Spatial filter calibration based on minimization of modified LSD. / Tanaka, Nobuaki; Ogawa, Tetsuji; Kobayashi, Tetsunori.

    Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2011. p. 1761-1764.

    @inproceedings{b6dd4c4aa65b4d42be2441e8f71c8a46,
    title = "Spatial filter calibration based on minimization of modified LSD",
    abstract = "A new sound source separation method has been developed that is robust against individual variability in microphones and acoustic lines. A specific area that has a target sound source was enhanced by using a spatial filter developed by time-frequency masking. However, there is a strong likelihood that the spatial filters will be distorted due to the impact of individual variability in microphone characteristics and acoustic lines. To solve this problem, calibration of these spatial filters' shapes was attempted using a modified log-spectral distance (MLSD) minimization criterion, which uses utterances made by each individual (i.e., a sound source) at the desired positions. The effectiveness of this spatial filter calibration was experimentally verified in speech recognition experiments; MLSD-based calibration had fewer word errors than the cases without calibration and calibration using other criteria.",
    keywords = "Modified LSD, Sound source separation, Spatial filter calibration, Time-frequency masking",
    author = "Nobuaki Tanaka and Tetsuji Ogawa and Tetsunori Kobayashi",
    year = "2011",
    language = "English",
    pages = "1761--1764",
    booktitle = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

    }

    TY - GEN

    T1 - Spatial filter calibration based on minimization of modified LSD

    AU - Tanaka, Nobuaki

    AU - Ogawa, Tetsuji

    AU - Kobayashi, Tetsunori

    PY - 2011

    Y1 - 2011

    N2 - A new sound source separation method has been developed that is robust against individual variability in microphones and acoustic lines. A specific area that has a target sound source was enhanced by using a spatial filter developed by time-frequency masking. However, there is a strong likelihood that the spatial filters will be distorted due to the impact of individual variability in microphone characteristics and acoustic lines. To solve this problem, calibration of these spatial filters' shapes was attempted using a modified log-spectral distance (MLSD) minimization criterion, which uses utterances made by each individual (i.e., a sound source) at the desired positions. The effectiveness of this spatial filter calibration was experimentally verified in speech recognition experiments; MLSD-based calibration had fewer word errors than the cases without calibration and calibration using other criteria.

    KW - Modified LSD

    KW - Sound source separation

    KW - Spatial filter calibration

    KW - Time-frequency masking

    UR - http://www.scopus.com/inward/record.url?scp=84865714069&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84865714069&partnerID=8YFLogxK

    M3 - Conference contribution

    AN - SCOPUS:84865714069

    SP - 1761

    EP - 1764

    BT - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

    ER -