Abstract:
In the past decade, deep learning has become ubiquitous across diverse fields such as natural language processing (NLP), computer vision, and speech processing. Despite their state-of-the-art performance, there are ongoing concerns regarding the robustness and explainability of deep-learning systems. These concerns have gained further traction due to the existence of adversarial examples, which make such systems behave in an undesirable fashion. To this end, this thesis explores several adversarial attacks and defenses for deep-learning-based vision and NLP systems.
For vision and vision-and-language systems, the following two problems are studied in this thesis: (i) Robustness of visual question answering (VQA) systems: We study the robustness of VQA systems to adversarial background noise. The results show that, by adding minimal background noise, such systems can easily be fooled into predicting an answer belonging to either the same category as, or a different category from, the original answer.
(ii) Task-agnostic adversarial attack for vision systems: We propose a task-agnostic adversarial attack named Mimic and Fool and show its effectiveness against vision systems designed for different tasks such as image classification, image captioning, and VQA. Although the attack relies on the information loss that occurs in a convolutional neural network, we show that even invertible architectures such as i-RevNet, which preserve all input information, are vulnerable to the proposed attack.
For NLP systems, the following three problems are studied in this thesis: (i) Invariance-based attack against neural machine translation (NMT) systems: We explore the robustness of NMT systems to nonsensical inputs obtained via an invariance-based attack. Unlike previous adversarial attacks against NMT systems, which make minimal changes to the source sentence in order to change the predicted translation, the invariance-based attack makes multiple changes to the source sentence with the goal of keeping the predicted translation unchanged.
(ii) Defense against the invariance-based attack: The nonsensical inputs obtained via the invariance-based attack do not have a ground-truth translation, which makes standard adversarial training infeasible as a defense strategy. In this context, we explore several defense strategies to counteract the invariance-based attack. (iii) Robustness of multiple-choice question answering (MCQ) systems and an intervention-based study: We explore the robustness of MCQ systems against the invariance-based attack. Furthermore, we study the generalizability of MCQ systems to different types of interventions on the input paragraph.