TY - GEN
T1 - Time-frequency-bin-wise beamformer selection and masking for speech enhancement in underdetermined noisy scenarios
AU - Yamaoka, Kouei
AU - Brendel, Andreas
AU - Ono, Nobutaka
AU - Makino, Shoji
AU - Buerger, Michael
AU - Yamada, Takeshi
AU - Kellermann, Walter
N1 - Funding Information:
This work was supported by the Japan Society for the Promotion of Science (JSPS) under Grant 16H01735, SECOM Science and Technology Foundation, and the Tsukuba-DAAD Joint Research Program.
Publisher Copyright:
© EURASIP 2018.
PY - 2018/11/29
Y1 - 2018/11/29
N2 - In this paper, we present a speech enhancement method using two microphones for underdetermined situations. A conventional speech enhancement method for underdetermined situations is time-frequency masking, where speech is enhanced by multiplying each time-frequency component by zero or one as appropriate. Extending this method, we switch among multiple preconstructed beamformers at each time-frequency bin, each of which suppresses a particular interferer. This method can suppress an interferer even when both the target and an interferer are simultaneously active at a given time-frequency bin. As a switching criterion, we investigate selecting the minimum value among the outputs of all the beamformers at each time-frequency bin. Additionally, a second method using direction-of-arrival estimation is investigated. In experiments, we confirmed that the proposed methods outperformed conventional time-frequency masking and fixed beamforming in speech enhancement performance.
AB - In this paper, we present a speech enhancement method using two microphones for underdetermined situations. A conventional speech enhancement method for underdetermined situations is time-frequency masking, where speech is enhanced by multiplying each time-frequency component by zero or one as appropriate. Extending this method, we switch among multiple preconstructed beamformers at each time-frequency bin, each of which suppresses a particular interferer. This method can suppress an interferer even when both the target and an interferer are simultaneously active at a given time-frequency bin. As a switching criterion, we investigate selecting the minimum value among the outputs of all the beamformers at each time-frequency bin. Additionally, a second method using direction-of-arrival estimation is investigated. In experiments, we confirmed that the proposed methods outperformed conventional time-frequency masking and fixed beamforming in speech enhancement performance.
UR - http://www.scopus.com/inward/record.url?scp=85059802769&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059802769&partnerID=8YFLogxK
U2 - 10.23919/EUSIPCO.2018.8553299
DO - 10.23919/EUSIPCO.2018.8553299
M3 - Conference contribution
AN - SCOPUS:85059802769
T3 - European Signal Processing Conference
SP - 1582
EP - 1586
BT - 2018 26th European Signal Processing Conference, EUSIPCO 2018
PB - European Signal Processing Conference, EUSIPCO
T2 - 26th European Signal Processing Conference, EUSIPCO 2018
Y2 - 3 September 2018 through 7 September 2018
ER -