TY - GEN
T1 - Semantic Segmentation in Learned Compressed Domain
AU - Liu, Jinming
AU - Sun, Heming
AU - Katto, Jiro
N1 - Funding Information:
This work was supported by the Japan Science and Technology Agency (JST) under Grant JPMJPR19M5; the Japan Society for the Promotion of Science (JSPS) under Grant 21K17770; the Kenjiro Takayanagi Foundation; and NICT, Japan, under Grant Number 03801.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Most machine vision tasks (e.g., semantic segmentation) operate on images that have been encoded and decoded by image compression algorithms (e.g., JPEG). However, these decoded pixel-domain images contain distortion and are optimized for human perception, which makes the performance of machine vision tasks suboptimal. In this paper, we propose a compressed-domain method to improve segmentation. i) A dynamic and a static channel selection method are proposed to reduce the redundancy of the compressed representations produced by encoding. ii) Two different transform modules are explored and analyzed to transform the compressed representation into features for the segmentation network. Experimental results show that the method saves up to 15.8% of the bitrate compared with a state-of-the-art compressed-domain method, and up to about 83.6% of the bitrate and 44.8% of the inference time compared with the pixel-domain method.
KW - Channel selection
KW - Compressed domain
KW - Deep learning
KW - Image compression
UR - http://www.scopus.com/inward/record.url?scp=85147666503&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147666503&partnerID=8YFLogxK
U2 - 10.1109/PCS56426.2022.10018036
DO - 10.1109/PCS56426.2022.10018036
M3 - Conference contribution
AN - SCOPUS:85147666503
T3 - 2022 Picture Coding Symposium, PCS 2022 - Proceedings
SP - 181
EP - 185
BT - 2022 Picture Coding Symposium, PCS 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 Picture Coding Symposium, PCS 2022
Y2 - 7 December 2022 through 9 December 2022
ER -