TY - GEN
T1 - Optimization of Sliding-DCT Based Gaussian Filtering for Hardware Accelerator
AU - Otsuka, Tomoki
AU - Fukushima, Norishige
AU - Maeda, Yoshihiro
AU - Sugimoto, Kenjiro
AU - Kamata, Sei Ichiro
N1 - Funding Information:
This work was supported by JSPS KAKENHI JP17H01764, 18K18076, 18K19813, 19K24368.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/12/1
Y1 - 2020/12/1
N2 - Gaussian filtering is a smoothing filter used in various tasks. Its main disadvantage is that the processing time depends on the kernel radius. One solution is the sliding discrete cosine transform (DCT), an algorithm whose running time is constant with respect to the kernel radius and which provides the best performance in terms of both speed and accuracy. However, the speed and accuracy differ according to the type of DCT used. The sliding-DCT based Gaussian filter can also be accelerated by hardware accelerators, but this requires modifying the algorithm. In this paper, we focus on the fused multiply-add (FMA) instruction of hardware accelerators in modern computer architectures. The FMA instruction performs a multiplication and an addition in a single operation, i.e., ax+b. We propose an acceleration method for sliding-DCT based Gaussian filtering that targets the FMA instruction. Moreover, we evaluate its performance in terms of computational time and approximation accuracy.
AB - Gaussian filtering is a smoothing filter used in various tasks. Its main disadvantage is that the processing time depends on the kernel radius. One solution is the sliding discrete cosine transform (DCT), an algorithm whose running time is constant with respect to the kernel radius and which provides the best performance in terms of both speed and accuracy. However, the speed and accuracy differ according to the type of DCT used. The sliding-DCT based Gaussian filter can also be accelerated by hardware accelerators, but this requires modifying the algorithm. In this paper, we focus on the fused multiply-add (FMA) instruction of hardware accelerators in modern computer architectures. The FMA instruction performs a multiplication and an addition in a single operation, i.e., ax+b. We propose an acceleration method for sliding-DCT based Gaussian filtering that targets the FMA instruction. Moreover, we evaluate its performance in terms of computational time and approximation accuracy.
KW - FMA
KW - Gaussian filter
KW - constant-time Gaussian filter
KW - sliding DCT
UR - http://www.scopus.com/inward/record.url?scp=85099477643&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099477643&partnerID=8YFLogxK
U2 - 10.1109/VCIP49819.2020.9301775
DO - 10.1109/VCIP49819.2020.9301775
M3 - Conference contribution
AN - SCOPUS:85099477643
T3 - 2020 IEEE International Conference on Visual Communications and Image Processing, VCIP 2020
SP - 423
EP - 426
BT - 2020 IEEE International Conference on Visual Communications and Image Processing, VCIP 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE International Conference on Visual Communications and Image Processing, VCIP 2020
Y2 - 1 December 2020 through 4 December 2020
ER -