CSC421/2516 Homework 3

Submission: You must submit your solutions as a PDF through MarkUs. You can produce the file however you like (e.g. LaTeX, Microsoft Word, scanner) as long as it is readable.

Late Submission: MarkUs will remain open until 3 days after the deadline, after which no late submissions will be accepted. The late penalty is 10% per day, rounded up.

Weekly homeworks are individual work. See the Course Information handout[1] for detailed policies.

  1. [5pts] For this question, you may wish to review the properties of expectation and variance: https://metacademy.org/graphs/concepts/expectation_and_variance
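For convenience, the standard identities you will need (stated here without proof) are:

\mathbb{E}[aX + b] = a\,\mathbb{E}[X] + b, \qquad \operatorname{Var}[aX + b] = a^2\,\operatorname{Var}[X],

\operatorname{Var}\!\Big[\sum_j X_j\Big] = \sum_j \operatorname{Var}[X_j] \quad \text{for independent } X_j,

X \sim \mathrm{Bernoulli}(p) \implies \mathbb{E}[X] = p, \quad \operatorname{Var}[X] = p(1 - p).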

Dropout has an interesting interpretation in the case of linear regression. Recall that the predictions are made stochastically as:

y = \sum_j m_j w_j x_j,

where the m_j's are i.i.d. (independent and identically distributed) Bernoulli random variables with expectation 1/2. (I.e., they are independent for every input dimension and every data point.) We would like to minimize the cost

J = \frac{1}{2}\, \mathbb{E}\big[(y - t)^2\big], \qquad (1)

where t is the target and the expectation is taken with respect to the m_j's.
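A minimal NumPy sketch for sanity-checking part (a) by Monte Carlo; the values of x, w, and t below are arbitrary toy numbers, and the printed estimates should match whatever closed-form expressions you derive:

import numpy as np

rng = np.random.default_rng(0)

# Arbitrary toy values, purely for illustration.
x = np.array([0.5, -1.0, 2.0])
w = np.array([1.0, 0.3, -0.7])
t = 0.4

# Sample many dropout masks, with m_j ~ Bernoulli(1/2) i.i.d. per dimension.
m = rng.integers(0, 2, size=(100_000, x.size))

# One stochastic prediction y = sum_j m_j w_j x_j per sampled mask.
y = m @ (w * x)

print("E[y]   ~", y.mean())                     # compare with your part (a) answer
print("Var[y] ~", y.var())                      # compare with your part (a) answer
print("J      ~", 0.5 * np.mean((y - t) ** 2))  # Monte Carlo estimate of Eqn. 1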

Now we show that this is equivalent to a regularized linear regression problem:

(a) [2pts] Find expressions for E[y] and Var[y] for a given x and w.

(b) [1pt] Determine \tilde{w}_j as a function of w_j such that

\mathbb{E}[y] = \tilde{y} = \sum_j \tilde{w}_j x_j.

Here, \tilde{y} can be thought of as the (deterministic) prediction made by a different model.

(c) [2pts] Using the model from the previous part, show that the cost J (Eqn. 1) can be written as

J = \frac{1}{2} \big(\tilde{y} - t\big)^2 + R,

where R is a function of the \tilde{w}_j's which does not involve an expectation. I.e., give an expression for R. (Note that R will depend on the data, so we call it a data-dependent regularizer.)

Hint: write the cost in terms of the mean and variance formulas from part (a). For inspiration, you may wish to refer to the derivation of the bias/variance decomposition from the Lecture 12 course notes.
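For reference, the decomposition the hint points to is the standard identity for a random variable y and a constant t:

\mathbb{E}\big[(y - t)^2\big] = \mathbb{E}[y^2] - 2t\,\mathbb{E}[y] + t^2 = \operatorname{Var}[y] + \big(\mathbb{E}[y] - t\big)^2.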


  2. Binary Addition [5pts] In this problem, you will implement a recurrent neural network that performs binary addition. The inputs are given as binary sequences, starting with the least significant binary digit. (It is easier to start from the least significant bit, just as in the grade-school addition algorithm.) The sequences will be padded with at least one zero on the end. For instance, the problem

100111 + 110010 = 1011001

would be represented as:

  • Input 1: 1, 1, 1, 0, 0, 1, 0
  • Input 2: 0, 1, 0, 0, 1, 1, 0
  • Correct output: 1, 0, 0, 1, 1, 0, 1

There are two input units corresponding to the two inputs, and one output unit. Therefore, the pattern of inputs and outputs for this example would be:

Time step:  1  2  3  4  5  6  7
Input 1:    1  1  1  0  0  1  0
Input 2:    0  1  0  0  1  1  0
Output:     1  0  0  1  1  0  1
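If you want to generate further test cases, here is a small Python helper (the name to_lsb_bits is ours, not the handout's) that reproduces this encoding:

def to_lsb_bits(n, length):
    # Binary digits of n, least significant first, zero-padded to `length`.
    return [(n >> i) & 1 for i in range(length)]

# The example above: 100111 (39) + 110010 (50) = 1011001 (89).
print(to_lsb_bits(0b100111, 7))   # [1, 1, 1, 0, 0, 1, 0]
print(to_lsb_bits(0b110010, 7))   # [0, 1, 0, 0, 1, 1, 0]
print(to_lsb_bits(0b1011001, 7))  # [1, 0, 0, 1, 1, 0, 1]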

Design the weights and biases for an RNN with two input units, three hidden units, and one output unit that implements binary addition. All of the units use the hard threshold activation function. In particular, specify weight matrices U, V, and W, bias vector b_h, and scalar bias b_y for the following architecture:

(The architecture diagram from the original handout is omitted here.)

Hint: In the grade school algorithm, you add up the values in each column, including the carry. Have one of your hidden units activate if the sum is at least 1, the second one if it is at least 2, and the third one if it is 3.
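To check a candidate solution, here is a minimal NumPy simulator, assuming the common convention that U maps inputs to hidden units, W is the hidden-to-hidden recurrence, and V maps hidden units to the output (the handout's figure fixes the actual convention). The weights filled in below are one assignment consistent with the hint, included as an illustration rather than as the official solution:

import numpy as np

def hard_threshold(z):
    # phi(z) = 1 if z >= 0, else 0 (applied elementwise).
    return (np.asarray(z) >= 0).astype(float)

# One weight assignment consistent with the hint (illustrative only):
# hidden unit i fires iff x1 + x2 + carry >= i, and the carry is h2.
U = np.ones((3, 2))          # every hidden unit sums the two input bits
W = np.zeros((3, 3))
W[:, 1] = 1.0                # previous h2 (column sum >= 2) is the carry
b_h = np.array([-0.5, -1.5, -2.5])  # thresholds at 1, 2, and 3
V = np.array([1.0, -1.0, 1.0])      # output bit = (sum is odd) = h1 - h2 + h3
b_y = -0.5

def rnn_add(x1, x2):
    h = np.zeros(3)  # hidden state starts at zero (no carry)
    out = []
    for a, b in zip(x1, x2):
        h = hard_threshold(U @ np.array([a, b]) + W @ h + b_h)
        out.append(int(hard_threshold(V @ h + b_y)))
    return out

# The example from the handout, least significant bit first:
print(rnn_add([1, 1, 1, 0, 0, 1, 0],
              [0, 1, 0, 0, 1, 1, 0]))  # expected: [1, 0, 0, 1, 1, 0, 1]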


[1] http://www.cs.toronto.edu/~rgrosse/courses/csc421_2019/syllabus.pdf
