Rectified linear unit can assist Griffin-Lim phase recovery

Kohei Yatabe, Yoshiki Masuyama, Yasuhiro Oikawa

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    4 Citations (Scopus)

    Abstract

    Phase recovery is an essential step in reconstructing a time-domain signal from its spectrogram when the phase is contaminated or unavailable. Recently, a phase recovery method using a deep neural network (DNN) was proposed, which interested us because the inverse short-time Fourier transform (inverse STFT) was used within the network. This inverse STFT converts a spectrogram into its time-domain counterpart, after which the activation function, a leaky rectified linear unit (ReLU), is applied. Such a nonlinear operation in the time domain resembles the speech enhancement method called harmonic regeneration noise reduction (HRNR). In HRNR, a time-domain nonlinearity, typically ReLU, is applied to help enhance the higher-order harmonics. This raised a question: can a time-domain ReLU by itself assist phase recovery? Inspired by this connection between the recent DNN-based phase recovery method and HRNR in speech enhancement, this paper proposes the ReLU-assisted Griffin-Lim algorithm to investigate that question. In a speech denoising experiment with the oracle Wiener filter, some positive effect of the time-domain nonlinearity is confirmed in terms of short-time objective intelligibility (STOI) scores.
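
The abstract does not state the update rule, so the following is only a minimal sketch of one plausible reading: a standard Griffin-Lim iteration in which a leaky ReLU is applied to the time-domain signal between the inverse STFT and the next STFT. The function names (`stft`, `istft`, `leaky_relu`, `griffin_lim`, `spectral_convergence`), the `slope` parameter, the window settings, and the synthetic harmonic test signal are all assumptions made for illustration and are not taken from the paper. Setting `slope=1.0` turns the nonlinearity off and gives plain Griffin-Lim.

```python
import numpy as np

def stft(x, win, hop):
    """STFT of a real signal, zero-padded by one window length on each side."""
    n = len(win)
    xp = np.pad(x, n)
    starts = range(0, len(xp) - n + 1, hop)
    frames = np.stack([xp[s:s + n] * win for s in starts])
    return np.fft.rfft(frames, axis=1)

def istft(X, win, hop, length):
    """Least-squares inverse STFT (weighted overlap-add) matching stft() above."""
    n = len(win)
    y = np.zeros(length + 2 * n)
    norm = np.zeros(length + 2 * n)
    for k, frame in enumerate(np.fft.irfft(X, n=n, axis=1)):
        y[k * hop:k * hop + n] += frame * win
        norm[k * hop:k * hop + n] += win ** 2
    y /= np.maximum(norm, 1e-8)
    return y[n:n + length]  # drop the zero padding added by stft()

def leaky_relu(x, slope):
    """Leaky ReLU: negative samples are scaled by `slope` (slope=1.0 is the identity)."""
    return np.where(x > 0, x, slope * x)

def spectral_convergence(mag, x, win, hop):
    """Relative mismatch between the target magnitude and the magnitude of STFT(x)."""
    return np.linalg.norm(mag - np.abs(stft(x, win, hop))) / np.linalg.norm(mag)

def griffin_lim(mag, win, hop, length, n_iter=50, slope=1.0, seed=0):
    """Griffin-Lim with an optional time-domain leaky ReLU (assumed placement)."""
    rng = np.random.default_rng(seed)
    X = mag * np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    for _ in range(n_iter):
        x = istft(X, win, hop, length)      # back to the time domain
        x = leaky_relu(x, slope)            # hypothetical nonlinearity step
        X = mag * np.exp(1j * np.angle(stft(x, win, hop)))  # keep phase, restore magnitude
    return istft(X, win, hop, length)

# Synthetic harmonic signal standing in for speech (an assumption, not the paper's data).
fs, n, hop = 8000, 256, 64
t = np.arange(fs // 2) / fs
x_ref = sum(np.sin(2 * np.pi * 200 * h * t) / h for h in range(1, 6))
win = np.hanning(n + 1)[:-1]  # periodic Hann window
mag = np.abs(stft(x_ref, win, hop))

x_plain = griffin_lim(mag, win, hop, len(x_ref), slope=1.0)  # plain Griffin-Lim
x_relu = griffin_lim(mag, win, hop, len(x_ref), slope=0.5)   # with leaky ReLU
print("plain:", spectral_convergence(mag, x_plain, win, hop))
print("relu :", spectral_convergence(mag, x_relu, win, hop))
```

The printed spectral-convergence values only measure how well each result matches the target magnitude on this toy signal. They are not a substitute for the STOI evaluation described in the abstract, and whether the nonlinearity helps will depend on the signal and settings.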

    Original language: English
    Title of host publication: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 555-559
    Number of pages: 5
    ISBN (Electronic): 9781538681510
    DOIs: https://doi.org/10.1109/IWAENC.2018.8521304
    Publication status: Published - 2018 Nov 2
    Event: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Tokyo, Japan
    Duration: 2018 Sep 17 to 2018 Sep 20

    Other

    Other: 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018
    Country: Japan
    City: Tokyo
    Period: 18/9/17 to 18/9/20

    Keywords

    • Consistency
    • Harmonic regeneration
    • Redundancy
    • Spectrogram
    • Time-domain nonlinearity

    ASJC Scopus subject areas

    • Signal Processing
    • Acoustics and Ultrasonics

    Cite this

    Yatabe, K., Masuyama, Y., & Oikawa, Y. (2018). Rectified linear unit can assist Griffin-Lim phase recovery. In 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings (pp. 555-559). [8521304] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IWAENC.2018.8521304

    @inproceedings{7befb32d76114c0bb064dd7be95d9bcd,
    title = "Rectified linear unit can assist Griffin-Lim phase recovery",
    keywords = "Consistency, Harmonic regeneration, Redundancy, Spectrogram, Time-domain nonlinearity",
    author = "Kohei Yatabe and Yoshiki Masuyama and Yasuhiro Oikawa",
    year = "2018",
    month = "11",
    day = "2",
    doi = "10.1109/IWAENC.2018.8521304",
    language = "English",
    pages = "555--559",
    booktitle = "16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings",
    publisher = "Institute of Electrical and Electronics Engineers Inc.",

    }

    TY - GEN

    T1 - Rectified linear unit can assist Griffin-Lim phase recovery

    AU - Yatabe, Kohei

    AU - Masuyama, Yoshiki

    AU - Oikawa, Yasuhiro

    PY - 2018/11/2

    Y1 - 2018/11/2

    KW - Consistency

    KW - Harmonic regeneration

    KW - Redundancy

    KW - Spectrogram

    KW - Time-domain nonlinearity

    UR - http://www.scopus.com/inward/record.url?scp=85057416166&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85057416166&partnerID=8YFLogxK

    U2 - 10.1109/IWAENC.2018.8521304

    DO - 10.1109/IWAENC.2018.8521304

    M3 - Conference contribution

    SP - 555

    EP - 559

    BT - 16th International Workshop on Acoustic Signal Enhancement, IWAENC 2018 - Proceedings

    PB - Institute of Electrical and Electronics Engineers Inc.

    ER -