TY - GEN
T1 - Proactive Detection of Query-based Adversarial Scenarios in NLP Systems
AU - Maghsoudimehrabani, Mohammad
AU - Azmoodeh, Amin
AU - Dehghantanha, Ali
AU - Zolfaghari, Behrouz
AU - Srivastava, Gautam
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/11/11
Y1 - 2022/11/11
AB - Adversarial attacks can mislead a Deep Learning (DL) algorithm into generating erroneous predictions by feeding it maliciously perturbed inputs called adversarial examples. DL-based Natural Language Processing (NLP) algorithms are severely threatened by adversarial attacks. In real-world black-box adversarial attacks, the adversary must submit many highly similar queries before crafting an adversarial example. Because of this long process, in-progress attack detection can play a significant role in defending DL-based NLP algorithms. Although several approaches exist for detecting adversarial attacks in NLP, they are reactive in the sense that they can detect adversarial examples only after the examples have been fabricated and fed into the algorithm. In this study, we take one step towards proactive detection of adversarial attacks in NLP systems by proposing a robust, history-based model named Stateful Query Analysis (SQA) to identify suspiciously similar sequences of queries capable of generating textual adversarial examples, which we refer to as adversarial scenarios. The model exhibits a detection rate of over 99.9% in our extensive experimental tests against several state-of-the-art black-box adversarial attack methods.
KW - adversarial attack detection
KW - natural language processing
KW - textual adversarial example
UR - http://www.scopus.com/inward/record.url?scp=85144018385&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85144018385&partnerID=8YFLogxK
U2 - 10.1145/3560830.3563727
DO - 10.1145/3560830.3563727
M3 - Conference contribution
AN - SCOPUS:85144018385
T3 - AISec 2022 - Proceedings of the 15th ACM Workshop on Artificial Intelligence and Security, co-located with CCS 2022
SP - 103
EP - 113
BT - AISec 2022 - Proceedings of the 15th ACM Workshop on Artificial Intelligence and Security, co-located with CCS 2022
PB - Association for Computing Machinery, Inc
T2 - 15th ACM Workshop on Artificial Intelligence and Security, AISec 2022 - Co-located with CCS 2022
Y2 - 11 November 2022
ER -