Robustness of Pre-trained Language Models against Adversarial Attacks

Authors

  • Jānis Bērziņš, Tilde, Riga, Latvia
  • Elīna Kalniņa, Tilde, Riga, Latvia

Abstract

Pre-trained language models, such as BERT, GPT, and their derivatives, have revolutionized natural language processing (NLP) tasks. Despite their success, these models are vulnerable to adversarial attacks, which pose significant threats to their robustness and reliability. This paper explores the robustness of pre-trained language models against various types of adversarial attacks, examining both the nature of these attacks and the defenses that can be employed. We review existing literature, analyze the strengths and weaknesses of current approaches, and propose directions for future research to enhance the robustness of these models.
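To make the notion of an adversarial attack concrete, the sketch below shows a toy word-substitution attack against an off-the-shelf sentiment classifier: words are greedily swapped for near-synonyms until the predicted label flips. This is an illustrative assumption, not the method studied in the paper; the library (Hugging Face transformers), the `SYNONYMS` table, and the helper names are chosen here for demonstration only.

```python
# Minimal sketch of a word-substitution adversarial attack on a sentiment
# classifier. The synonym table and greedy search are illustrative
# assumptions, not the attack analyzed in this paper.
from transformers import pipeline

# Generic pre-trained sentiment model (downloaded on first use).
classifier = pipeline("sentiment-analysis")

# Hand-crafted substitution candidates; real attacks generate these with
# word embeddings or a masked language model.
SYNONYMS = {
    "great": ["decent", "passable"],
    "love": ["tolerate"],
    "excellent": ["acceptable"],
}

def predict(text):
    result = classifier(text)[0]
    return result["label"], result["score"]

def word_substitution_attack(text):
    """Greedily swap words for near-synonyms until the predicted label flips."""
    original_label, _ = predict(text)
    words = text.split()
    for i, word in enumerate(words):
        for candidate in SYNONYMS.get(word.lower(), []):
            perturbed = " ".join(words[:i] + [candidate] + words[i + 1:])
            new_label, score = predict(perturbed)
            if new_label != original_label:
                return perturbed, new_label, score
    return None  # no label flip found with this tiny candidate set

if __name__ == "__main__":
    sentence = "I love this phone, the camera is great and the screen is excellent"
    print(predict(sentence))
    print(word_substitution_attack(sentence))
```

Even this naive search often degrades or flips predictions while keeping the sentence semantically close to the original, which is the core robustness problem the paper surveys.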

Published

2024-08-07