Sound event localization and detection (SELD) is the task of simultaneously identifying sound events and their locations. Existing methods perform SELD in an off-line setting using deep neural networks (DNNs), including bi-directional recurrent neural networks (BiRNNs). Although their effectiveness has been shown in the literature, they cannot be directly applied to real-time applications, which require on-line execution of SELD, i.e., the input signals must be processed successively with small latency. In this paper, we propose an on-line extension of off-line SELD systems and discuss the essential latency of an on-line SELD system. We investigate the relationship between system latency and SELD accuracy through experiments. The experimental results confirm that the on-line extension of the SELD system maintains or improves localization performance, while event detection performance degrades at low latency.