Associative Memory Model-Based Linear Filtering and Its Application to Tandem Connectionist Blind Source Separation

    Research output: Contribution to journal › Article

    2 Citations (Scopus)

    Abstract

    We propose a blind source separation method that yields high-quality speech with low distortion. Time-frequency (TF) masking can effectively reduce interference, but it produces nonlinear distortion. By contrast, linear filtering using a separation matrix such as independent vector analysis (IVA) can avoid nonlinear distortion, but the separation performance is reduced under reverberant conditions. The tandem connectionist approach combines several separation methods and it has been used frequently to compensate for the disadvantages of these methods. In this study, we propose associative memory model (AMM)-based linear filtering and a tandem connectionist framework, which applies TF masking followed by linear filtering. By using AMM trained with speech spectra to optimize the separation matrix, the proposed linear filtering method considers the properties of speech that are not considered explicitly in IVA, such as the harmonic components of spectra. TF masking is applied in the proposed tandem connectionist framework to reduce unwanted components that hinder the optimization of the separation matrix, and it is approximated by using a linear separation matrix to reduce nonlinear distortion. The results obtained in simultaneous speech separation experiments demonstrate that although the proposed linear filtering method can increase the signal-to-distortion ratio (SDR) and signal-to-interference ratio (SIR) compared with IVA, the proposed tandem connectionist framework can obtain greater increases in SDR and SIR, and it reduces the phoneme error rate more than the proposed linear filtering method.
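The contrast the abstract draws between TF masking (a gain that varies per time-frequency point, hence nonlinear) and linear filtering (one fixed separation vector per frequency bin) can be illustrated with a toy example. The sketch below is not the authors' AMM-based method: it assumes a single frequency bin, two sources, two microphones, a hypothetical mixing matrix `A`, and an oracle ratio mask, and it uses an ordinary least-squares fit as a stand-in for "approximating the mask with a linear separation matrix". All names (`crandn`, `fit_linear_filter`, `energy`) are illustrative.

```python
import random

random.seed(0)
T = 200  # number of time frames at one frequency bin (assumed)


def crandn():
    """Draw one complex Gaussian sample to stand in for an STFT coefficient."""
    return complex(random.gauss(0, 1), random.gauss(0, 1))


# Two source spectra and a hypothetical 2x2 mixing matrix (illustrative values).
s = [[crandn() for _ in range(T)] for _ in range(2)]
A = [[1.0, 0.6], [0.5, 1.0]]
x = [[A[m][0] * s[0][t] + A[m][1] * s[1][t] for t in range(T)] for m in range(2)]


def ratio_mask(t):
    """Oracle ratio mask for source 0 at microphone 0 (illustration only)."""
    p0 = abs(A[0][0] * s[0][t]) ** 2
    p1 = abs(A[0][1] * s[1][t]) ** 2
    return p0 / (p0 + p1)


# TF masking: the gain changes from frame to frame, so the operation is nonlinear.
mask = [ratio_mask(t) for t in range(T)]
masked = [mask[t] * x[0][t] for t in range(T)]


def fit_linear_filter(x, target):
    """Least-squares (w0, w1) minimizing sum_t |w0*x0[t] + w1*x1[t] - target[t]|^2."""
    # Normal equations G w = b, with G[i][j] = sum conj(x_i) x_j and b[i] = sum conj(x_i) target.
    g = [[sum(a.conjugate() * c for a, c in zip(x[i], x[j])) for j in range(2)]
         for i in range(2)]
    b = [sum(a.conjugate() * d for a, d in zip(x[i], target)) for i in range(2)]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    w0 = (g[1][1] * b[0] - g[0][1] * b[1]) / det
    w1 = (g[0][0] * b[1] - g[1][0] * b[0]) / det
    return w0, w1


# Linear filtering: one fixed pair of weights is applied to every frame.
w = fit_linear_filter(x, masked)
linear_out = [w[0] * x[0][t] + w[1] * x[1][t] for t in range(T)]


def energy(v):
    """Sum of squared magnitudes of a complex sequence."""
    return sum(abs(z) ** 2 for z in v)


# Energy of the part of the masked signal that the fixed filter cannot reproduce.
residual = energy([m - y for m, y in zip(masked, linear_out)])
```

In this toy setting `residual` measures how much of the masked signal a time-invariant filter cannot reproduce; the fitted output `linear_out` gives up that part in exchange for being a purely linear function of the microphone signals.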

    Original language: English
    Article number: 7819470
    Pages (from-to): 637-650
    Number of pages: 14
    Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
    Volume: 25
    Issue number: 3
    DOI: 10.1109/TASLP.2017.2653941
    Publication status: Published - 2017 Mar 1

    Fingerprint

    Linear Filtering
    Associative Memory
    Blind Source Separation
    Memory Model
    Linear Models
    Model-based
    Data storage equipment
    Nonlinear Distortion
    Vector analysis
    Masking
    Interference
    Matrices
    Phonemes
    Error Rate
    Harmonic

    Keywords

    • Blind source separation
    • independent vector analysis
    • neural network
    • speech recognition

    ASJC Scopus subject areas

    • Signal Processing
    • Media Technology
    • Instrumentation
    • Acoustics and Ultrasonics
    • Linguistics and Language
    • Speech and Hearing
    • Electrical and Electronic Engineering

    Cite this

    @article{2f8f902e0ffe4500b822fe4d0cfd5b02,
    title = "Associative Memory Model-Based Linear Filtering and Its Application to Tandem Connectionist Blind Source Separation",
    abstract = "We propose a blind source separation method that yields high-quality speech with low distortion. Time-frequency (TF) masking can effectively reduce interference, but it produces nonlinear distortion. By contrast, linear filtering using a separation matrix such as independent vector analysis (IVA) can avoid nonlinear distortion, but the separation performance is reduced under reverberant conditions. The tandem connectionist approach combines several separation methods and it has been used frequently to compensate for the disadvantages of these methods. In this study, we propose associative memory model (AMM)-based linear filtering and a tandem connectionist framework, which applies TF masking followed by linear filtering. By using AMM trained with speech spectra to optimize the separation matrix, the proposed linear filtering method considers the properties of speech that are not considered explicitly in IVA, such as the harmonic components of spectra. TF masking is applied in the proposed tandem connectionist framework to reduce unwanted components that hinder the optimization of the separation matrix, and it is approximated by using a linear separation matrix to reduce nonlinear distortion. The results obtained in simultaneous speech separation experiments demonstrate that although the proposed linear filtering method can increase the signal-to-distortion ratio (SDR) and signal-to-interference ratio (SIR) compared with IVA, the proposed tandem connectionist framework can obtain greater increases in SDR and SIR, and it reduces the phoneme error rate more than the proposed linear filtering method.",
    keywords = "Blind source separation, independent vector analysis, neural network, speech recognition",
    author = "Motoi Omachi and Tetsuji Ogawa and Tetsunori Kobayashi",
    year = "2017",
    month = "3",
    day = "1",
    doi = "10.1109/TASLP.2017.2653941",
    language = "English",
    volume = "25",
    pages = "637--650",
    journal = "IEEE/ACM Transactions on Audio, Speech, and Language Processing",
    issn = "2329-9290",
    publisher = "IEEE",
    number = "3",

    }

    TY - JOUR

    T1 - Associative Memory Model-Based Linear Filtering and Its Application to Tandem Connectionist Blind Source Separation

    AU - Omachi, Motoi

    AU - Ogawa, Tetsuji

    AU - Kobayashi, Tetsunori

    PY - 2017/3/1

    Y1 - 2017/3/1

    AB - We propose a blind source separation method that yields high-quality speech with low distortion. Time-frequency (TF) masking can effectively reduce interference, but it produces nonlinear distortion. By contrast, linear filtering using a separation matrix such as independent vector analysis (IVA) can avoid nonlinear distortion, but the separation performance is reduced under reverberant conditions. The tandem connectionist approach combines several separation methods and it has been used frequently to compensate for the disadvantages of these methods. In this study, we propose associative memory model (AMM)-based linear filtering and a tandem connectionist framework, which applies TF masking followed by linear filtering. By using AMM trained with speech spectra to optimize the separation matrix, the proposed linear filtering method considers the properties of speech that are not considered explicitly in IVA, such as the harmonic components of spectra. TF masking is applied in the proposed tandem connectionist framework to reduce unwanted components that hinder the optimization of the separation matrix, and it is approximated by using a linear separation matrix to reduce nonlinear distortion. The results obtained in simultaneous speech separation experiments demonstrate that although the proposed linear filtering method can increase the signal-to-distortion ratio (SDR) and signal-to-interference ratio (SIR) compared with IVA, the proposed tandem connectionist framework can obtain greater increases in SDR and SIR, and it reduces the phoneme error rate more than the proposed linear filtering method.

    KW - Blind source separation

    KW - independent vector analysis

    KW - neural network

    KW - speech recognition

    UR - http://www.scopus.com/inward/record.url?scp=85013054724&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85013054724&partnerID=8YFLogxK

    U2 - 10.1109/TASLP.2017.2653941

    DO - 10.1109/TASLP.2017.2653941

    M3 - Article

    VL - 25

    SP - 637

    EP - 650

    JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing

    JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing

    SN - 2329-9290

    IS - 3

    M1 - 7819470

    ER -