CSCI 4260/6260 Assignment 1: Adversarial Attack


Problem 1. Classify the following acts as a violation of confidentiality, of integrity, of availability, or of some combination of the three. You should clearly state any assumptions you make.
(i) John and his friends take the CS department's nike server offline by sending a massive amount of network packets.
(ii) Alice breaks into UGA's eLC web server and changes her grade for HW1.
(iii) Bob obtains his classmate's MyID and password.
(iv) Ransomware encrypts the victim's hard drive with a secret key, and a criminal asks for money in exchange for decryption.
Note: The answers to the above questions are not unique; hence, you need to explain clearly, as best as you can, why you think each act violates a certain principle.
Fast Gradient Sign Method
Recall the fast gradient sign method (FGSM) we learned in class. Let $x^{\mathrm{orig}} \in \mathbb{R}^{3 \times H \times W}$ denote an image of width $W$ and height $H$ with 3 color channels R, G, and B, and let $y^{\mathrm{orig}} \in \{1, 2, \ldots, K\}$ be its true label. Suppose we have a pre-trained neural network $f_\theta : \mathbb{R}^{3 \times H \times W} \to \{1, 2, \ldots, K\}$ that takes an image as input and returns a label. FGSM formulates the problem of generating an adversarial example $x^{\mathrm{adv}}$ from the given image $x^{\mathrm{orig}}$ as an optimization problem:

$$x^{\mathrm{adv}} = \arg\max_{x} \; \mathcal{L}(f_\theta(x), y^{\mathrm{orig}}) \quad \text{subject to} \quad \|x - x^{\mathrm{orig}}\|_\infty \le \epsilon, \tag{1}$$

where $\mathcal{L}$ is a loss function, $\epsilon > 0$ is a small constant, and $\|\cdot\|_\infty$ denotes the $L_\infty$ norm.
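To make the $L_\infty$ constraint concrete, here is a small numpy sketch comparing the $L_\infty$ and $L_2$ norms of a perturbation and showing how clipping keeps a perturbed image inside the $\epsilon$-ball; the array shapes and the value of eps here are illustrative, not taken from the assignment.

import numpy as np

eps = 0.05                        # illustrative perturbation budget
x_orig = np.random.rand(784)      # a stand-in flattened image
noise = np.random.randn(784) * 0.1

# L-infinity norm: the largest absolute entry of the perturbation.
# L2 norm: the Euclidean length of the perturbation.
print("L_inf norm:", np.max(np.abs(noise)))
print("L_2 norm:  ", np.linalg.norm(noise))

# Projecting onto the L-infinity ball of radius eps around x_orig
# amounts to clipping each coordinate of the noise to [-eps, eps].
x_adv = x_orig + np.clip(noise, -eps, eps)
assert np.max(np.abs(x_adv - x_orig)) <= eps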
Problem 2. Answer the following questions. You should elaborate on the rationale behind your answer using mathematical symbols and equations whenever possible.
(a) Notice that problem (1) enforces the constraint that the magnitude of the perturbation (or noise) $x - x^{\mathrm{orig}}$ is small using the $L_\infty$ norm. Read this Wikipedia article and give the mathematical definition of the $L_\infty$ norm. Explain in detail the role of this $L_\infty$ norm constraint in the adversarial attack.
(b) Suppose that you mistakenly change the $L_\infty$ norm constraint in (1) into an $L_2$ norm constraint. How do you think this changed formulation affects $x^{\mathrm{adv}}$ in terms of the perturbation (or noise)? Your answer should be based on the observed difference between the $L_\infty$ and $L_2$ norms.
(c) Write the equation for updating the adversarial example $x^{\mathrm{adv}}$ using FGSM. Your answer should be written using the symbols and notations used in our lecture slides.
(d) Is it possible to generate an adversarial image that incurs 0 loss? That is, does there exist an adversarial example $x^{\mathrm{adv}}$ such that $\mathcal{L}(f_\theta(x^{\mathrm{adv}}), y^{\mathrm{orig}}) = 0$?
(e) Notice that the above is a formulation for an untargeted attack. Suggest a way to modify the optimization problem in (1) to perform a targeted attack. In other words, provide the equation of the optimization problem for a targeted attack. You can denote the target class by $y^{\mathrm{target}}$.
Problem 3. Suppose that your (hypothetical) TA decided to construct an adversarial attack without using a reference image, i.e., without access to $x^{\mathrm{orig}}$, and came up with the following formulation:

$$x^{\mathrm{adv}} = \arg\min_{x} \; \frac{1}{2} \left\| f_\theta(x) - y^{\mathrm{target}} \right\|_2^2,$$

where $y^{\mathrm{target}}$ is the class to which the TA wants to misguide $f_\theta$. Do you think the resulting image, the solution of the above optimization problem, will look like a natural image? You should justify your answer in detail.
DeepFool
Recall that the DeepFool method uses an orthogonal projection to generate an adversarial example. Specifically, given an image $x^{\mathrm{orig}}$ to attack, it projects $x^{\mathrm{orig}}$ onto the decision boundary of a classifier.
Problem 4. (graduate only) Figure 1 shows the line $f$ corresponding to a linear classifier and an image $x^{\mathrm{orig}}$ in 2D space. Derive an equation for the $L_2$ distance between $x^{\mathrm{orig}}$ and $x^{\mathrm{proj}}$. You need to show your derivation step by step.
Figure 1: A linear classifier in 2D. The decision boundary is the line $f(x_1, x_2) = w_1 x_1 + w_2 x_2 + b = w^\top x + b = 0$; the figure shows the normal vector $w$, the image $x^{\mathrm{orig}}$, and its orthogonal projection $x^{\mathrm{proj}}$ onto the boundary.
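As a numerical sanity check for your derivation, the following sketch projects a point onto the hyperplane $w^\top x + b = 0$ using the standard point-to-hyperplane projection from linear algebra; the concrete values of w, b, and x_orig are made up for illustration.

import numpy as np

# Illustrative values, not from the assignment.
w = np.array([2.0, -1.0])
b = 0.5
x_orig = np.array([3.0, 1.0])

# Move from x_orig along the normal direction w until w^T x + b = 0.
x_proj = x_orig - ((w @ x_orig + b) / (w @ w)) * w

print(w @ x_proj + b)                            # ~0: x_proj lies on the boundary
print(np.linalg.norm(x_orig - x_proj))           # L2 distance to the boundary
print(abs(w @ x_orig + b) / np.linalg.norm(w))   # should match the previous line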
Programming Questions
Generating Adversarial Examples
Consider a neural network $f_\theta : \mathbb{R}^{784} \to \mathbb{R}^{10}$ consisting of three linear layers. For this assignment, we will use the MNIST dataset, an image dataset of handwritten digits (for a more detailed description of this dataset, please refer to this page). Figure 2 shows a visualization of the first 10 images in the training dataset.
Figure 2: Visualization of the first 10 examples in the MNIST dataset
Each image $x$ is a matrix of size $28 \times 28$ but is converted into a vector (by concatenating its row vectors) $x \in \mathbb{R}^{784}$ to be used as input to a linear layer. The neural network generates the output as follows:
input: $x \in \mathbb{R}^{784}$

$$z^{(1)} = W^{(1)} x + b^{(1)}, \qquad W^{(1)} \in \mathbb{R}^{512 \times 784}, \; b^{(1)} \in \mathbb{R}^{512} \tag{2}$$
$$a^{(1)} = \mathrm{sigmoid}(z^{(1)}) \quad \text{applied element-wise} \tag{3}$$
$$z^{(2)} = W^{(2)} a^{(1)} + b^{(2)}, \qquad W^{(2)} \in \mathbb{R}^{512 \times 512}, \; b^{(2)} \in \mathbb{R}^{512} \tag{4}$$
$$a^{(2)} = \mathrm{sigmoid}(z^{(2)}) \quad \text{applied element-wise} \tag{5}$$
$$z^{(3)} = W^{(3)} a^{(2)} + b^{(3)}, \qquad W^{(3)} \in \mathbb{R}^{10 \times 512}, \; b^{(3)} \in \mathbb{R}^{10} \tag{6}$$
$$a^{(3)} = \mathrm{softmax}(z^{(3)}), \tag{7}$$

where, for a vector $z = (z_1, \ldots, z_n)^\top$,

$$\mathrm{sigmoid}(z) = \left( \frac{1}{1 + e^{-z_1}}, \frac{1}{1 + e^{-z_2}}, \ldots, \frac{1}{1 + e^{-z_n}} \right)^{\top}$$

and

$$\mathrm{softmax}(z) = \left( \frac{e^{z_1}}{\sum_{j=1}^{n} e^{z_j}}, \frac{e^{z_2}}{\sum_{j=1}^{n} e^{z_j}}, \ldots, \frac{e^{z_n}}{\sum_{j=1}^{n} e^{z_j}} \right)^{\top}.$$
The (model) parameters are $\theta = \{W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)}, W^{(3)}, b^{(3)}\}$. Notice that, given an image $x \in \mathbb{R}^{784}$, the network $f_\theta(x)$ computes its (intermediate) outputs according to (2)–(7) and returns a vector $a^{(3)} \in \mathbb{R}^{10}$, the $i$-th entry of which corresponds to the class probability $P[Y = i \mid X = x]$ for class $i$. Since $a^{(3)}$ contains the class probabilities estimated by our neural network model, let us denote it by $\hat{y} = a^{(3)}$. To measure how close this estimated probability is to the ground truth (i.e., the labels), we use the following loss function, known as cross-entropy:

$$\mathcal{L}(\theta; x, y) := -\sum_{i=1}^{10} y_i \log \hat{y}_i,$$

where $\hat{y}_i$ denotes the $i$-th entry of $\hat{y}$. The network was trained on the MNIST dataset for 100 epochs using the stochastic gradient descent algorithm with mini-batch size 256 and step size 0.1. Figure 3 shows the performance of the trained network against the epoch number.
Figure 3: The train/test performance of the network over epochs
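For intuition, here is a minimal numpy sketch of the forward pass (2)–(7) and the cross-entropy loss. The randomly initialized weights are stand-ins for the trained parameters, which the assignment provides in trained_model.pkl.

import numpy as np

rng = np.random.default_rng(0)

# Stand-in parameters with the shapes from equations (2)-(6);
# the real values come from the provided trained model.
W1, b1 = 0.01 * rng.normal(size=(512, 784)), np.zeros(512)
W2, b2 = 0.01 * rng.normal(size=(512, 512)), np.zeros(512)
W3, b3 = 0.01 * rng.normal(size=(10, 512)), np.zeros(10)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))        # shift for numerical stability
    return e / e.sum()

def forward(x):
    a1 = sigmoid(W1 @ x + b1)        # equations (2)-(3)
    a2 = sigmoid(W2 @ a1 + b2)       # equations (4)-(5)
    return softmax(W3 @ a2 + b3)     # equations (6)-(7)

x = rng.random(784)                  # a stand-in flattened image
y = np.zeros(10)
y[7] = 1.0                           # one-hot label for class 7

y_hat = forward(x)
loss = -np.sum(y * np.log(y_hat))    # cross-entropy
print(loss)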
You can download the trained (model) parameters from here. They are also included in the base-code zip archive. For your convenience, the trained neural network object was serialized using Python's pickle library. To learn how to use pickle, see the examples on this page. The essential functionality of pickle is to serialize an object into a file or de-serialize it from a file. For example, to load the trained model from the file "trained_model.pkl", use the following code fragment.
import pickle

with open('trained_model.pkl', 'rb') as fid:
    model = pickle.load(fid)

# now you can use the model object
Similarly, for your convenience, the serialized MNIST dataset is provided here.
with open('mnist.pkl', 'rb') as fid:
    dataset = pickle.load(fid)

# dataset is a dictionary containing the training and test sets.
The keys and values of the dataset object are described in Table 1.

Key                    Value
'training images'      numpy array of size (60,000 × 784)
'training labels'      numpy array of size (60,000,)
'test images'          numpy array of size (10,000 × 784)
'test labels'          numpy array of size (10,000,)

Table 1: Dataset dictionary
Here are some important functions of the MLP class in the provided base code.

• model.forward(x) : This function takes a batch of images, performs the forward operation, and returns the output, i.e., the estimated class probabilities for the images in x. Here x is a 2D numpy array, each row corresponding to an image. Mathematically, it returns $\hat{y} = f_\theta(x)$.

• model.backward(y) : This function implements the famous backpropagation algorithm. It computes the gradients recursively and returns three lists.
  – grad_w : gradients w.r.t. W, i.e., it contains $\left[ \frac{\partial \mathcal{L}}{\partial W^{(1)}}, \frac{\partial \mathcal{L}}{\partial W^{(2)}}, \frac{\partial \mathcal{L}}{\partial W^{(3)}} \right]$.
  – grad_b : gradients w.r.t. b, i.e., it contains $\left[ \frac{\partial \mathcal{L}}{\partial b^{(1)}}, \frac{\partial \mathcal{L}}{\partial b^{(2)}}, \frac{\partial \mathcal{L}}{\partial b^{(3)}} \right]$.
  – delta : gradients w.r.t. z, i.e., it contains $\left[ \frac{\partial \mathcal{L}}{\partial z^{(1)}}, \frac{\partial \mathcal{L}}{\partial z^{(2)}}, \frac{\partial \mathcal{L}}{\partial z^{(3)}} \right]$.
  You won't need to call this function directly, because it returns $\nabla_\theta \mathcal{L}$ (the gradient of the loss function w.r.t. the model parameters $\theta$), which is used for updating the model parameters $\theta$.

• grad_wrt_input(x, y) : This is the function you'll mainly be using to generate your adversarial examples. It computes the gradient(s) w.r.t. the given input pair(s) (x, y). In other words, it returns $\nabla_x \mathcal{L}(\theta; x, y) = \frac{\partial \mathcal{L}}{\partial x}$.
To visualize your adversarial example, you can use the visualize_example() function, which has the following input arguments.
• x_img : a numpy array corresponding to your adversarial image
• y_probs : class probabilities for x_img (estimated by $f_\theta$)
• b_unnormalize : if set to True, the function unnormalizes the given image by applying the inverse of the standardization. Fix this to True.
• label : an integer value corresponding to the label of x_img
• filename : a string under which the visualization is stored, if provided. If filename=None, the function displays the plot without storing it in a file.
Problem 5. In this problem, our goal is to generate a random image $x^{\mathrm{adv}}$ that our neural network $f_\theta$ believes belongs to a class $y^{\mathrm{target}} \in \{0, 1, \ldots, 9\}$.
• Generate a random image $x_0 \in \mathbb{R}^{784}$, each of whose entries is randomly drawn from the uniform distribution $\mathrm{Uniform}(-\alpha, \alpha)$. Set $\alpha = 0.1$.
• Given $x_0$, our goal is to gradually modify $x_t$ until the neural network $f_\theta$ recognizes it as an instance of class $y^{\mathrm{target}}$, using the gradient descent algorithm.
• The fact that $x_t$ is recognized as an instance of class $y^{\mathrm{target}}$ means that the loss $\mathcal{L}(\theta; x_t, y)$ takes its minimum value when $y = y^{\mathrm{target}}$. For your understanding, think about why this is the case. Since $x_0$ is randomly initialized, at the first iteration $x_0$ will not be classified into class $y^{\mathrm{target}}$, which means $\mathcal{L}(\theta; x_t, y^{\mathrm{target}})$ is high. Hence, our goal is to gradually change $x_t$ such that $\mathcal{L}(\theta; x_t, y^{\mathrm{target}})$ decreases. In other words, we will solve the following optimization problem using the gradient descent algorithm:

$$x^{\mathrm{adv}} = \arg\min_{x} \; \mathcal{L}(\theta; x, y^{\mathrm{target}})$$

• Repeat the gradient descent update until $x_t$ is classified into class $y^{\mathrm{target}}$.
• Implement the following function.
import pickle


def generate_image_for_class(model, target_class, alpha=0.1):
    """
    Generates a random image that will be classified
    as class target_class by the neural network.

    Parameters:
    ------------------------------------
    model: neural network model object
    target_class: integer, the class to which the network classifies the image
    alpha: each pixel in the image is initialized by sampling from
           the uniform distribution over (-alpha, alpha)
    """
    # ----------------------------------------#
    #           Your code goes here           #
    # ----------------------------------------#


def main():
    model = None
    with open('trained_model.pkl', 'rb') as fid:
        model = pickle.load(fid)

    for c in range(10):
        generate_image_for_class(model, c)


if __name__ == "__main__":
    main()
– Your function should store the generated image as "targeted_random_img_class_[y_target].png".
– Generate the images for $y^{\mathrm{target}} = 0, 1, 2, \ldots, 9$.
– If necessary, you can change the prototypes of the above functions; a minimal sketch of the update loop is given after this list.
– Your generated image will not look like any digit (recall that you started with a random image in which each pixel value is drawn from a uniform distribution), yet your neural network will classify it as a digit belonging to class $y^{\mathrm{target}}$.
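Here is one minimal sketch of that update loop, assuming model.forward and model.grad_wrt_input behave as described earlier and that labels are one-hot; the step size, iteration cap, and label encoding are assumptions, not requirements from the assignment.

import numpy as np

def generate_image_for_class(model, target_class, alpha=0.1,
                             step_size=0.1, max_iters=1000):
    # Random starting image x_0, each pixel drawn from Uniform(-alpha, alpha).
    x = np.random.uniform(-alpha, alpha, size=(1, 784))
    y = np.zeros((1, 10))
    y[0, target_class] = 1.0        # assumed one-hot encoding of y_target

    for _ in range(max_iters):
        probs = model.forward(x)
        if probs.argmax(axis=1)[0] == target_class:
            break                   # f_theta now classifies x as y_target
        # Gradient descent on L(theta; x, y_target) with respect to the input x.
        x = x - step_size * model.grad_wrt_input(x, y)
    return x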
Problem 6. In this problem, we will perform an untargeted attack using the Fast Gradient Sign Method (FGSM). This means that we don't have any specific target class $y^{\mathrm{target}}$ in mind; the attack is regarded as successful as long as the network misclassifies the given image.
• The first example in your test set, denoted by $x^{\mathrm{test}}$, is an instance of class 7 (i.e., $y^{\mathrm{test}} = 7$).
• Use the first example in the test set as the starting point. That is, set $x_0 = x^{\mathrm{test}}$.
• Using the FGSM algorithm, perturb $x_t$ until it is no longer classified as class $y^{\mathrm{test}}$.
• In other words, solve the following optimization problem:

$$x^{\mathrm{adv}} = \arg\max_{x} \; \mathcal{L}(\theta; x, y^{\mathrm{test}}) \quad \text{subject to} \quad \|x - x^{\mathrm{test}}\|_\infty \le \epsilon,$$

where $\epsilon = 0.05$.
Store the generated image as "FGSM_untargeted.png" using the visualize_example() function.
import pickle


def fgsm(x_test, y_test, model, eps=0.05):
    # --------------------------------#
    #       Your code goes here       #
    # --------------------------------#

    return x


def main():
    # load dataset
    mnist = None
    with open('mnist.pkl', 'rb') as fid:
        mnist = pickle.load(fid)

    # load model
    model = None
    with open('trained_model.pkl', 'rb') as fid:
        model = pickle.load(fid)

    # --------------------------------#
    #       Your code goes here       #
    # --------------------------------#

    x_adv = fgsm(x_test, y_test, model)
    # visualize x_adv


if __name__ == "__main__":
    main()
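As with Problem 5, here is one minimal sketch of how fgsm() could be filled in, iterating small sign steps and clipping back to the $\epsilon$-ball around $x^{\mathrm{test}}$; the per-step size, iteration cap, and one-hot label handling are assumptions, not part of the assignment.

import numpy as np

def fgsm(x_test, y_test, model, eps=0.05, step=0.01, max_iters=100):
    x_orig = x_test.reshape(1, -1)
    y = np.zeros((1, 10))
    y[0, int(y_test)] = 1.0            # assumed one-hot encoding of y_test

    x = x_orig.copy()
    for _ in range(max_iters):
        if model.forward(x).argmax(axis=1)[0] != int(y_test):
            break                       # untargeted attack succeeded
        # Ascend the loss in the direction of the gradient's sign ...
        x = x + step * np.sign(model.grad_wrt_input(x, y))
        # ... and project back onto the L-infinity ball of radius eps.
        x = np.clip(x, x_orig - eps, x_orig + eps)
    return x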