DEFENSE AGAINST WHITE BOX ADVERSARIAL ATTACKS IN ARABIC NATURAL LANGUAGE PROCESSING (ANLP)
DOI: https://doi.org/10.59287/ijanser.1149

Keywords: Adversarial Attack, Multi-Layer Perceptron, Sentiment Analysis, Text Classification, Social Media, Natural Language Processing

Abstract
Adversarial attacks are among the most serious threats to the accuracy of classifiers in machine learning systems. Such attacks trick a classification model into making false predictions by feeding it perturbed data whose noise only a human can detect. The risk is especially high in natural language processing applications, because much of the data in this setting is collected from social networking sites that impose no restrictions on how users write comments, which makes it easy to create an attack (intentionally or unintentionally) that degrades the model's accuracy. In this paper, an MLP model is used for sentiment analysis of texts taken from tweets; the effect of applying a white-box adversarial attack to this classifier is studied, and a technique is proposed to protect it from the attack. Applying the proposed methodology, we found that the adversarial attack decreases the classifier's accuracy from 55.17% to 11.11%, while the proposed defense technique raises it to 77.77%; the proposed plan can therefore be adopted against this adversarial attack. Attackers choose their targets strategically and deliberately, based on the vulnerabilities they have identified. Organizations and individuals mostly try to protect themselves from a single occurrence or type of attack, yet they must acknowledge that an attacker can easily shift focus to newly uncovered vulnerabilities. Even when several attacks are successfully countered, risks remain, and the need to face new threats will persist for the foreseeable future.
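To make the described setup concrete, the following is a minimal, illustrative sketch of a bag-of-words MLP sentiment classifier of the kind the abstract refers to. The toy dataset, feature extraction, and hyperparameters here are assumptions for illustration only, not the paper's actual configuration.

```python
# Illustrative sketch (not the paper's setup): a TF-IDF + MLP
# sentiment classifier trained on a tiny, invented toy dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Toy training data; 1 = positive sentiment, 0 = negative sentiment.
train_texts = ["great service", "awful experience", "really good", "very bad"]
train_labels = [1, 0, 1, 0]

# TF-IDF turns each text into a sparse feature vector; the MLP
# (one small hidden layer, assumed hyperparameters) classifies it.
clf = make_pipeline(
    TfidfVectorizer(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
)
clf.fit(train_texts, train_labels)

# Predict sentiment for unseen texts.
preds = clf.predict(["good service", "very bad"])
print(preds)
```

A white-box attacker with access to such a model's gradients could perturb the input text to flip these predictions, which is the scenario the paper's defense addresses.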