Deep Griffin-Lim Iteration

Yoshiki Masuyama, Kohei Yatabe, Yuma Koizumi, Yasuhiro Oikawa, Noboru Harada

Research output: Conference contribution

3 Citations (Scopus)

Abstract

This paper presents a novel phase reconstruction method (only from a given amplitude spectrogram) by combining a signal-processing-based approach and a deep neural network (DNN). To retrieve a time-domain signal from its amplitude spectrogram, the corresponding phase is required. One of the popular phase reconstruction methods is the Griffin-Lim algorithm (GLA), which is based on the redundancy of the short-time Fourier transform. However, GLA often involves many iterations and produces low-quality signals owing to the lack of prior knowledge of the target signal. In order to address these issues, in this study, we propose an architecture which stacks a sub-block including two GLA-inspired fixed layers and a DNN. The number of stacked sub-blocks is adjustable, and we can trade the performance and computational load based on requirements of applications. The effectiveness of the proposed method is investigated by reconstructing phases from amplitude spectrograms of speeches.
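The alternating-projection idea behind GLA, and the way a stack of sub-blocks could wrap it, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the STFT settings (`N_FFT`, `HOP`, a periodic Hann window), the function names, and in particular the residual form `z - estimate_residual(x, y, z)` used in `stacked_blocks` are assumptions made for the example, and the learned DNN is represented only by an arbitrary callable.

```python
import numpy as np

# Assumed STFT settings for this sketch (not taken from the paper).
N_FFT, HOP = 256, 64
WINDOW = np.hanning(N_FFT + 1)[:-1]  # periodic Hann window


def stft(x):
    """Complex spectrogram of shape (frames, N_FFT // 2 + 1)."""
    n_frames = 1 + (len(x) - N_FFT) // HOP
    frames = np.stack([x[i * HOP:i * HOP + N_FFT] for i in range(n_frames)])
    return np.fft.rfft(frames * WINDOW, axis=1)


def istft(spec):
    """Inverse STFT by windowed overlap-add, normalised by the summed squared window."""
    frames = np.fft.irfft(spec, n=N_FFT, axis=1) * WINDOW
    length = N_FFT + HOP * (len(spec) - 1)
    x, norm = np.zeros(length), np.zeros(length)
    for i, frame in enumerate(frames):
        x[i * HOP:i * HOP + N_FFT] += frame
        norm[i * HOP:i * HOP + N_FFT] += WINDOW ** 2
    return x / np.maximum(norm, 1e-8)


def project_amplitude(spec, amplitude):
    """Keep the current phase and impose the given amplitude."""
    return amplitude * np.exp(1j * np.angle(spec))


def project_consistency(spec):
    """Map onto spectrograms that are the STFT of some time-domain signal.

    This relies on the redundancy of the STFT: most complex arrays are not
    the STFT of any signal, so a round trip through istft/stft changes them.
    """
    return stft(istft(spec))


def inconsistency(spec):
    """Distance between a spectrogram and its consistent projection."""
    return np.linalg.norm(spec - project_consistency(spec))


def griffin_lim(amplitude, n_iter=50, seed=0):
    """Plain GLA: start from random phase and alternate the two projections."""
    rng = np.random.default_rng(seed)
    spec = amplitude * np.exp(2j * np.pi * rng.random(amplitude.shape))
    for _ in range(n_iter):
        spec = project_amplitude(project_consistency(spec), amplitude)
    return spec


def stacked_blocks(amplitude, spec, residual_estimators):
    """Hypothetical stack of sub-blocks: two fixed projections, then a learned correction.

    Each entry of residual_estimators stands in for a trained DNN; the
    subtraction below is an assumed residual-learning form.
    """
    for estimate_residual in residual_estimators:
        y = project_amplitude(spec, amplitude)
        z = project_consistency(y)
        spec = z - estimate_residual(spec, y, z)
    return spec
```

With estimators that always return zeros, `stacked_blocks` reduces to repeating the two fixed projections, so the length of `residual_estimators` plays the role of the adjustable number of sub-blocks: a longer list costs more computation per reconstruction. A time-domain signal would then be obtained by passing the result to `istft`.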

Original language: English
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 61-65
Number of pages: 5
ISBN (Electronic): 9781479981311
DOI: https://doi.org/10.1109/ICASSP.2019.8682744
Publication status: Published - 1 May 2019
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: 12 May 2019 to 17 May 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country: United Kingdom
City: Brighton
Period: 12 May 2019 to 17 May 2019

Fingerprint

Redundancy
Fourier transforms
Signal processing
Deep neural networks

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Masuyama, Y., Yatabe, K., Koizumi, Y., Oikawa, Y., & Harada, N. (2019). Deep Griffin-Lim Iteration. In 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings (pp. 61-65). [8682744] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2019-May). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2019.8682744

Deep Griffin-Lim Iteration. / Masuyama, Yoshiki; Yatabe, Kohei; Koizumi, Yuma; Oikawa, Yasuhiro; Harada, Noboru.

2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. p. 61-65 8682744 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2019-May).

Masuyama, Y, Yatabe, K, Koizumi, Y, Oikawa, Y & Harada, N 2019, Deep Griffin-Lim Iteration. In 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings, 8682744, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2019-May, Institute of Electrical and Electronics Engineers Inc., pp. 61-65, 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019, Brighton, United Kingdom, 12 May 2019. https://doi.org/10.1109/ICASSP.2019.8682744
Masuyama Y, Yatabe K, Koizumi Y, Oikawa Y, Harada N. Deep Griffin-Lim Iteration. In 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2019. p. 61-65. 8682744. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). https://doi.org/10.1109/ICASSP.2019.8682744
Masuyama, Yoshiki ; Yatabe, Kohei ; Koizumi, Yuma ; Oikawa, Yasuhiro ; Harada, Noboru. / Deep Griffin-Lim Iteration. 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 61-65 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).
@inproceedings{f75fc9c3839e4c60b47d3f3388492d33,
title = "Deep Griffin-Lim Iteration",
abstract = "This paper presents a novel phase reconstruction method (only from a given amplitude spectrogram) by combining a signal-processing-based approach and a deep neural network (DNN). To retrieve a time-domain signal from its amplitude spectrogram, the corresponding phase is required. One of the popular phase reconstruction methods is the Griffin-Lim algorithm (GLA), which is based on the redundancy of the short-time Fourier transform. However, GLA often involves many iterations and produces low-quality signals owing to the lack of prior knowledge of the target signal. In order to address these issues, in this study, we propose an architecture which stacks a sub-block including two GLA-inspired fixed layers and a DNN. The number of stacked sub-blocks is adjustable, and we can trade the performance and computational load based on requirements of applications. The effectiveness of the proposed method is investigated by reconstructing phases from amplitude spectrograms of speeches.",
keywords = "deep neural network, Phase reconstruction, residual learning, spectrogram consistency",
author = "Yoshiki Masuyama and Kohei Yatabe and Yuma Koizumi and Yasuhiro Oikawa and Noboru Harada",
year = "2019",
month = "5",
day = "1",
doi = "10.1109/ICASSP.2019.8682744",
language = "English",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "61--65",
booktitle = "2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings",

}

TY - GEN

T1 - Deep Griffin-Lim Iteration

AU - Masuyama, Yoshiki

AU - Yatabe, Kohei

AU - Koizumi, Yuma

AU - Oikawa, Yasuhiro

AU - Harada, Noboru

PY - 2019/5/1

Y1 - 2019/5/1

AB - This paper presents a novel phase reconstruction method (only from a given amplitude spectrogram) by combining a signal-processing-based approach and a deep neural network (DNN). To retrieve a time-domain signal from its amplitude spectrogram, the corresponding phase is required. One of the popular phase reconstruction methods is the Griffin-Lim algorithm (GLA), which is based on the redundancy of the short-time Fourier transform. However, GLA often involves many iterations and produces low-quality signals owing to the lack of prior knowledge of the target signal. In order to address these issues, in this study, we propose an architecture which stacks a sub-block including two GLA-inspired fixed layers and a DNN. The number of stacked sub-blocks is adjustable, and we can trade the performance and computational load based on requirements of applications. The effectiveness of the proposed method is investigated by reconstructing phases from amplitude spectrograms of speeches.

KW - deep neural network

KW - Phase reconstruction

KW - residual learning

KW - spectrogram consistency

UR - http://www.scopus.com/inward/record.url?scp=85068966043&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85068966043&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2019.8682744

DO - 10.1109/ICASSP.2019.8682744

M3 - Conference contribution

AN - SCOPUS:85068966043

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 61

EP - 65

BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -