[Solved] CSE676 Assignment #1-Softmax


0.1 Softmax

  • Prove that softmax is invariant to constant shifts in the input, i.e., for any input vector $x$ and a constant scalar $c$, the following holds:

$$\mathrm{softmax}(x) = \mathrm{softmax}(x + c),$$

where $\mathrm{softmax}(x)_i = e^{x_i} / \sum_j e^{x_j}$, and $x + c$ means adding $c$ to every dimension of $x$. (A numerical check of this property is sketched after this list.)

  • Let $z = Wx + c$, where $W$ and $c$ are some matrix and vector, respectively. Let

$$J = \sum_i \log \mathrm{softmax}(z)_i .$$

Calculate the derivatives of $J$ w.r.t. $W$ and $c$, respectively, i.e., calculate $\frac{\partial J}{\partial W}$ and $\frac{\partial J}{\partial c}$.
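Before writing the proof, both claims are easy to sanity-check numerically. The sketch below (NumPy assumed; the helper names `softmax` and `objective` and the random data are mine, not part of the assignment) verifies the shift invariance on a random vector and finite-difference-checks one coordinate of $\frac{\partial J}{\partial c}$.

```python
import numpy as np

def softmax(v):
    # Subtracting the max is itself an application of the shift-invariance
    # property being proved; it only improves numerical stability.
    e = np.exp(v - v.max())
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=5)

# Shift invariance: softmax(x) == softmax(x + c) for a scalar c.
assert np.allclose(softmax(x), softmax(x + 3.7))

# J = sum_i log softmax(z)_i with z = W x + c (the vector c is named
# cvec here to avoid clashing with the scalar shift above).
W = rng.normal(size=(4, 5))
cvec = rng.normal(size=4)

def objective(W, cvec):
    return np.log(softmax(W @ x + cvec)).sum()

# Central finite differences for one coordinate of dJ/dc.
eps = 1e-5
e0 = np.zeros(4); e0[0] = 1.0
num = (objective(W, cvec + eps * e0) - objective(W, cvec - eps * e0)) / (2 * eps)
print("numerical dJ/dc_0 ≈", num)
```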

0.2 Logistic Regression with Regularization

  • [10 points] Let the data be $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^d$ and $y_i \in \{0, 1\}$. Logistic regression is a binary classification model, with the probability of $y_i$ being 1 given as:

$$p(y_i = 1 \mid x_i) = \sigma(\theta^\top x_i) = \frac{1}{1 + e^{-\theta^\top x_i}},$$

where $\theta$ is the model parameter. Assume we impose an L2 regularization term on the parameter, defined as:

$$R(\theta) = \frac{\lambda}{2} \|\theta\|_2^2,$$

with a positive constant $\lambda$. Write out the final objective function for this logistic regression with regularization model.

  • [10 points] Suppose we use gradient descent to solve for the model parameter. Derive the update rule for $\theta$. Your answer should contain the derivation, not just the final answer. (A minimal gradient-descent sketch appears after this list.)
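For reference, here is a minimal gradient-descent sketch under the usual assumptions: the objective is the negative log-likelihood plus the $\frac{\lambda}{2}\|\theta\|_2^2$ term above, whose gradient is $X^\top(\sigma(X\theta) - y) + \lambda\theta$. The function name `fit_logreg_l2` and the synthetic data are mine; treat this as a way to check a derivation, not as the graded answer.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logreg_l2(X, y, lam=0.1, lr=0.01, iters=1000):
    """Gradient descent for L2-regularized logistic regression.

    Assumes the objective
        L(theta) = -sum_i [y_i log p_i + (1 - y_i) log(1 - p_i)]
                   + (lam / 2) * ||theta||^2,
    whose gradient is X^T (p - y) + lam * theta, with p = sigmoid(X theta).
    """
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = sigmoid(X @ theta)               # P(y_i = 1 | x_i)
        grad = X.T @ (p - y) + lam * theta   # gradient of the objective
        theta -= lr * grad                   # the update rule being derived
    return theta

# Tiny synthetic usage example.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)
print(fit_logreg_l2(X, y))
```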

0.3 Derivative of the Softmax Function

1) [10 points] Define the loss function as

$$J(z) = -\sum_{k=1}^{K} y_k \log \hat{y}_k ,$$

where $\hat{y} = \mathrm{softmax}(z)$, and $(y_1, \ldots, y_K)$ is a known probability vector. Derive $\frac{\partial J}{\partial z}$.

Note $z = (z_1, \ldots, z_K)$ is a vector, so $\frac{\partial J}{\partial z}$ is in the form of a vector. Your answer should contain the derivation, not just the final answer. (A finite-difference check is sketched below.)
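Whatever form the derivation produces can be validated numerically. The sketch below (NumPy assumed; helpers are mine) compares central finite differences of $J$ against $\mathrm{softmax}(z) - y$, the closed form this derivation is commonly quoted as yielding.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def J(z, y):
    # Cross-entropy against the known probability vector y.
    return -(y * np.log(softmax(z))).sum()

rng = np.random.default_rng(2)
K = 6
z = rng.normal(size=K)
y = rng.dirichlet(np.ones(K))  # sums to 1, as the problem states

# Central finite differences for each coordinate of dJ/dz.
eps = 1e-5
num = np.array([(J(z + eps * e, y) - J(z - eps * e, y)) / (2 * eps)
                for e in np.eye(K)])

# Compare against the commonly quoted closed form softmax(z) - y.
assert np.allclose(num, softmax(z) - y)
print("finite differences match softmax(z) - y")
```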


2) [10 points] Assume the above softmax is the output layer of an FNN. Briefly explain how this derivative is used in the backpropagation algorithm.

3) [10 points] Let $z = W^\top h + b$, where $W$ is a matrix and $b$ and $h$ are vectors. Use the chain rule to calculate the gradients of $J$ w.r.t. $W$ and $b$, i.e., $\frac{\partial J}{\partial W}$ and $\frac{\partial J}{\partial b}$, respectively. (See the sketch below for a numerical check.)
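Continuing the sketch from problem 1, the chain rule here is usually summarized as: with $g = \frac{\partial J}{\partial z}$, one gets $\frac{\partial J}{\partial W} = h\,g^\top$ and $\frac{\partial J}{\partial b} = g$. The check below spot-verifies one entry of $\frac{\partial J}{\partial W}$ by finite differences (NumPy assumed; helper names are mine).

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss(W, b, h, y):
    # Cross-entropy from problem 1 applied to z = W^T h + b.
    return -(y * np.log(softmax(W.T @ h + b))).sum()

rng = np.random.default_rng(3)
d, K = 4, 3
W = rng.normal(size=(d, K))
b = rng.normal(size=K)
h = rng.normal(size=d)
y = rng.dirichlet(np.ones(K))  # known probability vector

# Chain rule: g = dJ/dz, then dJ/dW = outer(h, g) and dJ/db = g.
g = softmax(W.T @ h + b) - y
dW = np.outer(h, g)

# Spot-check entry (1, 2) of dJ/dW with central finite differences.
eps = 1e-5
E = np.zeros_like(W); E[1, 2] = 1.0
num = (loss(W + eps * E, b, h, y) - loss(W - eps * E, b, h, y)) / (2 * eps)
assert np.isclose(num, dW[1, 2])
print("chain-rule gradients match finite differences")
```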

0.4 MNIST with FNN

1) [30 points] Design an FNN for MNIST classification. Implement the model and plot two curves in one figure: i) training loss vs. training iterations; ii) test loss vs. training iterations. (A minimal starting-point sketch appears after this list.)

  • You can use online code. However, you must reference (cite) the code in your answer.
  • Submission includes the plot of the two curves and the runnable code (with a ReadMe file containing instructions on how to run the code).
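As a starting point, here is a minimal sketch assuming PyTorch, torchvision, and matplotlib are installed. It is not the required submission (no ReadMe or citation is included), and the architecture, optimizer, and logging interval are arbitrary choices of mine.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import matplotlib.pyplot as plt

# Standard MNIST loaders.
tfm = transforms.ToTensor()
train_ds = datasets.MNIST("data", train=True, download=True, transform=tfm)
test_ds = datasets.MNIST("data", train=False, download=True, transform=tfm)
train_dl = DataLoader(train_ds, batch_size=128, shuffle=True)
test_dl = DataLoader(test_ds, batch_size=512)

# A small fully connected network: 784 -> 256 -> 10.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                      nn.Linear(256, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def test_loss():
    # Average cross-entropy over the whole test set.
    model.eval()
    total, n = 0.0, 0
    with torch.no_grad():
        for xb, yb in test_dl:
            total += loss_fn(model(xb), yb).item() * len(xb)
            n += len(xb)
    model.train()
    return total / n

train_curve, test_curve, iters = [], [], []
step = 0
for epoch in range(2):
    for xb, yb in train_dl:
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()
        if step % 100 == 0:  # record both losses every 100 iterations
            train_curve.append(loss.item())
            test_curve.append(test_loss())
            iters.append(step)
        step += 1

# Plot both curves in one figure, as the assignment asks.
plt.plot(iters, train_curve, label="training loss")
plt.plot(iters, test_curve, label="test loss")
plt.xlabel("training iterations"); plt.ylabel("loss"); plt.legend()
plt.savefig("mnist_fnn_losses.png")
```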
