TY - JOUR
T1 - Deep Neural Backdoor in Semi-Supervised Learning
T2 - Threats and Countermeasures
AU - Yan, Zhicong
AU - Wu, Jun
AU - Li, Gaolei
AU - Li, Shenghong
AU - Guizani, Mohsen
N1 - Funding Information:
This work was supported in part by the National Natural Science Foundation of China under Grant 61971283, Grant U20B2048, and Grant 61972255; in part by Shanghai Sailing Program under Grant 21YF1421700; and in part by Shanghai Municipal Science and Technology Major Project under Grant 2021SHZDZX0102.
Publisher Copyright:
© 2005-2012 IEEE.
PY - 2021
Y1 - 2021
N2 - Semi-Supervised Learning (SSL) is a powerful approach for discovering hidden knowledge in data and a promising substitute for manual data labeling. Although the availability of unlabeled data has generated great enthusiasm for SSL, the untrustworthiness of unlabeled data introduces many unknown security risks. In this paper, we first identify an insidious backdoor threat to SSL in which unlabeled training data are poisoned by backdoor methods migrated from supervised settings. Then, to further exploit this threat, a Deep Neural Backdoor (DeNeB) scheme is proposed, which requires a smaller data-poisoning budget and produces a stronger backdoor effect. By poisoning only a fraction of the unlabeled training data, DeNeB achieves illegal manipulation of the trained model without modifying the training process. Finally, an efficient detection-and-purification defense (DePuD) framework is proposed to thwart the proposed scheme. In DePuD, we construct a deep detector to locate trigger patterns in the unlabeled training data and perform secured SSL training with purified unlabeled data in which the detected trigger patterns are obfuscated. Extensive experiments on benchmark datasets demonstrate the severity of the threat posed by DeNeB and the effectiveness of DePuD. To the best of our knowledge, this is the first work to achieve a backdoor attack and its defense in semi-supervised learning.
KW - Semi-supervised learning
KW - backdoor
KW - detection-and-purification
KW - unlabeled data
UR - http://www.scopus.com/inward/record.url?scp=85118221452&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85118221452&partnerID=8YFLogxK
U2 - 10.1109/TIFS.2021.3116431
DO - 10.1109/TIFS.2021.3116431
M3 - Article
AN - SCOPUS:85118221452
VL - 16
SP - 4827
EP - 4842
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
SN - 1556-6013
ER -