Location
Hilton Hawaiian Village, Honolulu, Hawaii
Event Website
https://hicss.hawaii.edu/
Start Date
January 3, 2024
End Date
January 6, 2024
Description
In this paper, we focus on defending against adversarial attacks for privacy-preserving Natural Language Processing (NLP) under a model partitioning scenario, where the model is split into a local, on-device part and a remote, cloud-based part. Model partitioning improves scalability and protects the privacy of the model's inputs. However, we argue that this privacy protection breaks down during inference: an adversary who eavesdrops on the hidden representations output by the local device can use them to recover private information about the input text. We study two types of adversarial attacks, i.e., adversarial classification and adversarial generation. Based on these two attack models, we propose two corresponding defenses: defending against adversarial classification (DAC) and defending against adversarial generation (DAG). Both DAC and DAG are bilevel optimization-based defense methods: each optimally modifies a subpopulation of the neural representations so as to maximally decrease the adversary's ability to recover private information. Representations trained with this bilevel optimization keep sensitive information away from the adversary while maintaining their utility for downstream tasks. Our experiments show that both DAC and DAG improve the performance of the main text classifier and achieve higher privacy for the neural representations than current state-of-the-art methods.
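To make the partitioned-inference setting and the bilevel defense concrete, the sketch below shows a toy version of the adversarial-classification case: a local encoder produces representations that a cloud-side classifier consumes, an eavesdropping classifier probes those representations for a private attribute, and a learned perturbation on a subset of representation dimensions is optimized to hurt the adversary while preserving main-task accuracy. All module names, dimensions, the toy data, the masked perturbation, the loss weighting, and the alternating inner/outer updates are illustrative assumptions, not the authors' DAC/DAG implementation.

# Hypothetical sketch of the partitioned-inference threat model and a
# bilevel-style defense on the intermediate representations.
import torch
import torch.nn as nn

torch.manual_seed(0)

DIM_IN, DIM_REP, N_MAIN, N_PRIV = 32, 16, 4, 2

local_encoder = nn.Sequential(nn.Linear(DIM_IN, DIM_REP), nn.ReLU())  # on-device part
main_classifier = nn.Linear(DIM_REP, N_MAIN)                          # cloud-based part
adversary = nn.Linear(DIM_REP, N_PRIV)                                # eavesdropper's probe

# Toy data: inputs, main-task labels, and a private attribute (all random).
x = torch.randn(256, DIM_IN)
y_main = torch.randint(0, N_MAIN, (256,))
y_priv = torch.randint(0, N_PRIV, (256,))

# "Subpopulation" of representation dimensions the defense may modify
# (how this subset is chosen is an assumption here).
mask = torch.zeros(DIM_REP)
mask[: DIM_REP // 2] = 1.0
delta = torch.zeros(DIM_REP, requires_grad=True)  # learned perturbation

ce = nn.CrossEntropyLoss()
opt_model = torch.optim.Adam(
    list(local_encoder.parameters()) + list(main_classifier.parameters()), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
opt_delta = torch.optim.Adam([delta], lr=1e-2)

for step in range(200):
    # Defended representation that would be sent to the cloud.
    z = local_encoder(x) + mask * delta

    # Inner problem: the adversary fits the current (detached) representations.
    adv_loss = ce(adversary(z.detach()), y_priv)
    opt_adv.zero_grad()
    adv_loss.backward()
    opt_adv.step()

    # Outer problem: keep main-task utility while maximizing the adversary's loss.
    z = local_encoder(x) + mask * delta
    main_loss = ce(main_classifier(z), y_main)
    privacy_loss = -ce(adversary(z), y_priv)
    total = main_loss + 0.5 * privacy_loss  # 0.5 is an arbitrary illustrative weight
    opt_model.zero_grad()
    opt_delta.zero_grad()
    total.backward()
    opt_model.step()
    opt_delta.step()

For the adversarial-generation case (DAG), the adversary module would instead be a decoder that tries to reconstruct the input text from the representations, with the same alternating structure; that variant is not shown here.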
Recommended Citation
Zhan, Huixin; Zhang, Kun; Chen, Zhong; and Sheng, Victor, "Defense Against Adversarial Attacks for Neural Representations of Text" (2024). Hawaii International Conference on System Sciences 2024 (HICSS-57). 4.
https://aisel.aisnet.org/hicss-57/st/threat_hunting/4
Defense Against Adversarial Attacks for Neural Representations of Text