
Problems

1. LFD Problem 1.3
2. Out of textbook
3. LFD Problem 1.7
4. LFD Problem 1.8

CSE417: Introduction to Machine Learning

Problem Set 1

Q1: LFD Problem 1.3

(a)

Since $w^*$ is the optimal set of weights, for all $1 \le n \le N$, $w^{*T} x_n$ must have the same sign as $y_n$, since a linear separation is achieved.

Therefore, for all $1 \le n \le N$, $y_n(w^{*T} x_n) > 0$, which implies $\min_{1 \le n \le N} y_n(w^{*T} x_n) > 0$, i.e. $\rho > 0$.

(b)

Given the update rule $w(t+1) = w(t) + y(t)x(t)$, transpose both sides and multiply by $w^*$:

$$w(t+1)^T = w(t)^T + (y(t)x(t))^T$$

$$w(t+1)^T w^* = w(t)^T w^* + (y(t)x(t))^T w^*$$

$$w(t+1)^T w^* = w(t)^T w^* + y(t)\,(x(t)^T w^*) \quad \text{since } y(t) \text{ is } 1 \times 1$$

$$w(t+1)^T w^* = w(t)^T w^* + y(t)\,(w^{*T} x(t)) \quad \text{since } x(t)^T w^* \text{ is } 1 \times 1 \text{ and hence equal to its transpose}$$

By the definition $\rho = \min_{1 \le n \le N} y_n(w^{*T} x_n)$, and since $(x(t), y(t))$ is one of the $N$ training examples, we have $\rho \le y(t)\,(w^{*T} x(t))$ for every iteration $t$.

Then we can conclude that:

$$w(t+1)^T w^* \ge w(t)^T w^* + \rho$$

Base case: $t = 0$. Since we assumed $w(0) = 0$, we have:

$$w(0)^T w^* = 0 \ge 0 \cdot \rho = 0$$

Induction: for $t = n$, given that the inequality is valid:

$$w(n)^T w^* \ge n\rho$$

and incorporating the inequality we deduced above:

$$w(n+1)^T w^* \ge w(n)^T w^* + \rho \ge n\rho + \rho = (n+1)\rho$$

so the inequality must also be valid for $t = n + 1$. Thus proved.

(c)

Given the update rule $w(t+1) = w(t) + y(t)x(t)$, multiply each side by its own transpose:

$$w(t+1)^T w(t+1) = (w(t) + y(t)x(t))^T\,(w(t) + y(t)x(t))$$

$$\|w(t+1)\|^2 = \|w(t)\|^2 + 2y(t)\,(w(t)^T x(t)) + y(t)^2\,\|x(t)\|^2$$

Since $y(t)^2 = 1$ for any $y(t)$, and $2y(t)\,(w(t)^T x(t)) \le 0$ because $x(t)$ is a point misclassified by $w(t)$, we have:

$$\|w(t+1)\|^2 \le \|w(t)\|^2 + y(t)^2\,\|x(t)\|^2 = \|w(t)\|^2 + \|x(t)\|^2$$

(d)

Proof:

Base case: for $t = 0$, $\|w(0)\|^2 = 0 \le 0 \cdot R^2 = 0$.

Induction:

Given that $\|w(t)\|^2 \le tR^2$, and using part (c) together with $\|x(t)\|^2 \le R^2$ (since $R = \max_{1 \le n \le N} \|x_n\|$), we have:

$$\|w(t+1)\|^2 \le \|w(t)\|^2 + \|x(t)\|^2 \le \|w(t)\|^2 + R^2 \le tR^2 + R^2 = (t+1)R^2$$

so the inequality must also be valid for $t + 1$. Thus proved.

(e)

Proof: Using the conclusion in part (b), $w(t)^T w^* \ge t\rho$, and the conclusion in part (d), $\|w(t)\| \le \sqrt{t}\,R$, we have:

$$\frac{w(t)^T}{\|w(t)\|}\, w^* \ge \frac{t\rho}{\sqrt{t}\,R} = \sqrt{t}\,\frac{\rho}{R}$$

Since $\frac{w(t)^T w^*}{\|w(t)\|\,\|w^*\|} = \cos\theta$, where $\theta$ represents the angle between $w(t)$ and $w^*$, we must have $\cos\theta \le 1$, so $\frac{w(t)^T}{\|w(t)\|}\, w^* \le \|w^*\|$. Then we can rewrite the inequality as:

$$\sqrt{t}\,\frac{\rho}{R} \le \|w^*\|$$

Given that $t \ge 0$, squaring both sides gives:

$$t \le \frac{R^2\,\|w^*\|^2}{\rho^2}$$
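Not part of the original submission: a minimal numerical sketch of parts (a) through (e), assuming NumPy and a hypothetical randomly generated separable dataset (the names `w_star`, `rho`, `R2` are illustrative, not from the writeup). It runs PLA and checks each bound along the way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: random points labeled by a random target w_star,
# so the data are linearly separable by construction.
N, d = 100, 3
X = rng.normal(size=(N, d))
w_star = rng.normal(size=d)
y = np.sign(X @ w_star)

rho = np.min(y * (X @ w_star))        # part (a): rho = min_n y_n (w*^T x_n) > 0
R2 = np.max(np.sum(X**2, axis=1))     # R^2 = max_n ||x_n||^2

w, t = np.zeros(d), 0
while True:
    misclassified = np.where(np.sign(X @ w) != y)[0]
    if misclassified.size == 0:
        break
    n = misclassified[0]
    w = w + y[n] * X[n]               # PLA update: w(t+1) = w(t) + y(t) x(t)
    t += 1
    assert w @ w_star >= t * rho - 1e-9    # part (b): w(t)^T w* >= t*rho
    assert w @ w <= t * R2 + 1e-9          # part (d): ||w(t)||^2 <= t*R^2

print(f"converged after t = {t} updates; "
      f"part (e) bound: t <= {R2 * (w_star @ w_star) / rho**2:.1f}")
```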

Q2

The first plot shows the number of operations for each iteration, and the second plot is a histogram of the log difference. Plots attached below:

Q3: LFD Problem 1.7

(a)

For $\mu = 0.05$:

$$P[\nu = 0 \mid 10 \text{ flips}, \mu = 0.05] = 0.95^{10} = 0.5987$$

1 Coin: $P = 0.5987$

10 Coins: $P = 1 - (1 - 0.95^{10})^{10} = 0.9999$

1000 Coins: $P = 1 - (1 - 0.95^{10})^{1000} \approx 1$

1000000 Coins: $P = 1 - (1 - 0.95^{10})^{1000000} \approx 1$

For $\mu = 0.8$:

$$P[\nu = 0 \mid 10 \text{ flips}, \mu = 0.8] = 0.2^{10} = 1.024 \times 10^{-7}$$

1 Coin: $P = 1.024 \times 10^{-7}$

10 Coins: $P = 1 - (1 - 0.2^{10})^{10} = 1.024 \times 10^{-6}$

1000 Coins: $P = 1 - (1 - 0.2^{10})^{1000} = 1.024 \times 10^{-4}$

1000000 Coins: $P = 1 - (1 - 0.2^{10})^{1000000} = 0.09733$
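As a quick sanity check on these numbers (not part of the original), a short Python sketch that evaluates $1 - (1 - (1-\mu)^{10})^n$ for each case:

```python
# Probability that at least one of n coins shows no heads in 10 flips,
# where each flip lands heads with probability mu.
def p_at_least_one_all_tails(mu, n, flips=10):
    p_single = (1 - mu) ** flips       # one coin gets nu = 0
    return 1 - (1 - p_single) ** n     # complement over n independent coins

for mu in (0.05, 0.8):
    for n in (1, 10, 1000, 1_000_000):
        print(f"mu={mu}, {n} coins: {p_at_least_one_all_tails(mu, n):.6g}")
```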

(b)

The exact result is $P\!\left[\max_i |\nu_i - \mu_i| > \epsilon\right] = 1 - P[|\nu_1 - \mu_1| \le \epsilon]\; P[|\nu_2 - \mu_2| \le \epsilon]$ for each $\epsilon$, since the two coins are independent; summing the binomial probabilities produces a step function in $\epsilon$. Using the Hoeffding bound together with the union bound for 2 coins:

$$P\!\left[\max_i |\nu_i - \mu_i| > \epsilon\right] \le 2 \cdot 2e^{-2N\epsilon^2}$$

Plot attached below:
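A hedged sketch of how the step function and the bound could be computed, assuming (as in the textbook's statement of this part) 2 fair coins with $\mu = 0.5$ and $N = 6$ tosses each:

```python
import numpy as np
from math import comb

N, mu = 6, 0.5   # assumed from the textbook: 2 fair coins, 6 tosses each

def p_single_dev(eps):
    # P[|nu - mu| > eps] for one coin: binomial pmf summed over deviating k's
    return sum(comb(N, k) * 0.5**N for k in range(N + 1) if abs(k / N - mu) > eps)

for eps in np.linspace(0, 1, 11):
    exact = 1 - (1 - p_single_dev(eps))**2              # 2 iid coins
    bound = min(1.0, 2 * 2 * np.exp(-2 * N * eps**2))   # Hoeffding + union bound
    print(f"eps={eps:.1f}  exact={exact:.4f}  bound={bound:.4f}")
```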

Q4: LFD Problem 1.8

(a)

By definition, for a non-negative random variable $t$ with density $f$, we have:

$$E(t) = \int_0^\infty \tau f(\tau)\,d\tau \ge \int_\alpha^\infty \tau f(\tau)\,d\tau \ge \alpha \int_\alpha^\infty f(\tau)\,d\tau = \alpha\, P[t \ge \alpha]$$

Then for each $\alpha > 0$, we have:

$$P[t \ge \alpha] \le \frac{E(t)}{\alpha}$$

Proved.
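Not in the original: a quick Monte Carlo illustration of the inequality, using an exponential random variable as an arbitrary non-negative example:

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.exponential(scale=2.0, size=1_000_000)   # non-negative r.v., E[t] = 2

for alpha in (1.0, 2.0, 4.0, 8.0):
    empirical = np.mean(t >= alpha)
    markov = t.mean() / alpha                    # P[t >= alpha] <= E[t] / alpha
    print(f"alpha={alpha}: P[t>=alpha]={empirical:.4f} <= {markov:.4f}")
```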

(b)

By definition, $E[(u - \mu)^2] = \sigma^2$. Since $(u - \mu)^2$ is a non-negative random variable, using the conclusion in part (a) with $t = (u - \mu)^2$, we have:

$$P[(u - \mu)^2 \ge \alpha] \le \frac{E[(u - \mu)^2]}{\alpha} = \frac{\sigma^2}{\alpha}$$

(c)

By definition, for $N$ iid random variables $u_1, u_2, \ldots, u_N$, each with $E(u_n) = \mu$ and $\mathrm{Var}(u_n) = \sigma^2$ for $1 \le n \le N$, let $u = \frac{1}{N}\sum_{n=1}^{N} u_n$. We have that:

$$E(u) = \frac{1}{N}\sum_{n=1}^{N} E(u_n) = \mu$$

And that, by independence:

$$\mathrm{Var}(u) = \frac{1}{N^2}\sum_{n=1}^{N} \mathrm{Var}(u_n) = \frac{\sigma^2}{N}$$

Then, similar to part (b):

$$P[(u - \mu)^2 \ge \alpha] \le \frac{\mathrm{Var}(u)}{\alpha} = \frac{\sigma^2}{N\alpha}$$

Proved.
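As an illustration (not part of the original), a small simulation checking that the squared deviation of the sample mean of $N$ draws obeys the bound $\sigma^2/(N\alpha)$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, trials = 50, 200_000
mu, sigma2 = 0.0, 1.0                  # standard normal draws, so sigma^2 = 1

u_bar = rng.normal(mu, np.sqrt(sigma2), size=(trials, N)).mean(axis=1)

for alpha in (0.01, 0.05, 0.1):
    empirical = np.mean((u_bar - mu)**2 >= alpha)
    bound = sigma2 / (N * alpha)       # P[(u_bar - mu)^2 >= alpha] <= sigma^2/(N*alpha)
    print(f"alpha={alpha}: empirical={empirical:.4f} <= bound={bound:.4f}")
```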
