Research Output
Evaluating Language Model Vulnerability to Poisoning Attacks in Low-Resource Settings
  Pre-trained language models are a highly effective source of knowledge transfer for natural language processing tasks, as their development represents an investment of resources beyond the reach of most researchers and end users. The widespread availability of such easily adaptable resources has enabled high levels of performance, which is especially valuable for low-resource language users who have typically been overlooked when it comes to NLP applications. However, these models introduce vulnerabilities in NLP toolchains, since they may prove vulnerable to attacks from malicious actors with access to the data used for downstream training. By perturbing instances from the training set, such attacks seek to undermine model capabilities and produce radically different outcomes during inference. We show that adversarial data manipulation has a severe effect on model performance, with BERT's performance dropping by more than 30% on average across all tasks at a poisoning ratio greater than 50%. Additionally, we conduct the first evaluation of this kind in the Basque language domain, establishing the vulnerability of low-resource models to the same form of attack.

  • Date:

    28 November 2024

  • Publication Status:

    Published

  • Publisher

    Institute of Electrical and Electronics Engineers (IEEE)

  • DOI:

  • ISSN:

    2329-9290

  • Funders:

    Engineering and Physical Sciences Research Council

Citation

Âé¶¹ÉçÇø

Plant, R., Giuffrida, M. V., Pitropakis, N., & Gkatzia, D. (2024). Evaluating Language Model Vulnerability to Poisoning Attacks in Low-Resource Settings. IEEE/ACM Transactions on Audio, Speech and Language Processing, 33, 54-67. https://doi.org/10.1109/taslp.2024.3507565

Authors

Keywords

Language modelling, machine learning methods for hlt, language understanding and computational semantics

Monthly Views:

Linked Projects

Available Documents