
Problems

1. LFD Problem 1.3
2. Out of textbook
3. LFD Problem 1.7
4. LFD Problem 1.8

CSE417: Introduction to Machine Learning

Problem Set 1

Q1: LFD Problem 1.3

(a)

Since $w^*$ is the optimal set of weights, for all $1 \le n \le N$, $w^{*T} x_n$ must have the same sign as $y_n$, since a linear separation is achieved.

Therefore, for all $1 \le n \le N$, $y_n(w^{*T} x_n) > 0$, which implies $\min_{1 \le n \le N} y_n(w^{*T} x_n) > 0$, i.e. $\rho > 0$.

(b)

Given the update rule $w(t+1) = w(t) + y(t)x(t)$, transpose both sides and multiply by $w^*$:

$$w(t+1)^T = w(t)^T + (y(t)x(t))^T$$

$$w(t+1)^T w^* = w(t)^T w^* + (y(t)x(t))^T w^*$$

$$w(t+1)^T w^* = w(t)^T w^* + y(t)\,(x(t)^T w^*) \quad \text{since } y(t) \text{ is } 1 \times 1$$

$$w(t+1)^T w^* = w(t)^T w^* + y(t)\,(w^{*T} x(t)) \quad \text{since } x(t)^T w^* \text{ is } 1 \times 1 \text{ and hence equal to its transpose}$$

By the definition $\rho = \min_{1 \le n \le N} y_n(w^{*T} x_n)$, and since $(x(t), y(t))$ is one of the $N$ training examples, we have $\rho \le y(t)\,(w^{*T} x(t))$ for every iteration $t$.

Then we can conclude that:

$$w(t+1)^T w^* \ge w(t)^T w^* + \rho$$

Base case: $t = 0$. Since we assumed $w(0) = 0$, we have:

$$w(0)^T w^* = 0 \ge 0 \cdot \rho = 0$$

Induction: for $t = n$, given that the inequality is valid:

$$w(n)^T w^* \ge n\rho$$

and incorporating the inequality we deduced above:

$$w(n+1)^T w^* \ge w(n)^T w^* + \rho \ge n\rho + \rho = (n+1)\rho$$

so the inequality must also be valid for $t = n + 1$. Thus proved.

(c)

Given the update rule $w(t+1) = w(t) + y(t)x(t)$, multiply each side by its own transpose:

$$w(t+1)^T w(t+1) = (w(t) + y(t)x(t))^T\,(w(t) + y(t)x(t))$$

$$\|w(t+1)\|^2 = \|w(t)\|^2 + 2y(t)\,(w(t)^T x(t)) + y(t)^2\,\|x(t)\|^2$$

Since $y(t)^2 = 1$ for any $y(t)$, and $2y(t)\,(w(t)^T x(t)) \le 0$ because $x(t)$ is a point misclassified by $w(t)$, we have:

$$\|w(t+1)\|^2 \le \|w(t)\|^2 + y(t)^2\,\|x(t)\|^2 = \|w(t)\|^2 + \|x(t)\|^2$$

(d)

Proof:

Base case: for $t = 0$, $\|w(0)\|^2 = 0 \le 0 \cdot R^2 = 0$.

Induction:

Given that $\|w(t)\|^2 \le tR^2$, and using part (c) together with $\|x(t)\|^2 \le R^2$ (since $R = \max_{1 \le n \le N} \|x_n\|$), we have:

$$\|w(t+1)\|^2 \le \|w(t)\|^2 + \|x(t)\|^2 \le \|w(t)\|^2 + R^2 \le tR^2 + R^2 = (t+1)R^2$$

so the inequality must also be valid for $t + 1$. Thus proved.

(e)

Proof: Using the conclusion in part (b), $w(t)^T w^* \ge t\rho$, and the conclusion in part (d), $\|w(t)\| \le \sqrt{t}\,R$, we have:

$$\frac{w(t)^T}{\|w(t)\|}\, w^* \ge \frac{t\rho}{\sqrt{t}\,R} = \sqrt{t}\,\frac{\rho}{R}$$

Since $\frac{w(t)^T w^*}{\|w(t)\|\,\|w^*\|} = \cos\theta$, where $\theta$ represents the angle between $w(t)$ and $w^*$, we must have $\cos\theta \le 1$, so $\frac{w(t)^T}{\|w(t)\|}\, w^* \le \|w^*\|$. Then we can rewrite the inequality as:

$$\sqrt{t}\,\frac{\rho}{R} \le \|w^*\|$$

Given that $t \ge 0$, squaring both sides gives:

$$t \le \frac{R^2\,\|w^*\|^2}{\rho^2}$$
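Not part of the original submission: a minimal numerical sketch of parts (a) through (e), assuming NumPy and a hypothetical randomly generated separable dataset (the names `w_star`, `rho`, `R2` are illustrative, not from the writeup). It runs PLA and checks each bound along the way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: random points labeled by a random target w_star,
# so the data are linearly separable by construction.
N, d = 100, 3
X = rng.normal(size=(N, d))
w_star = rng.normal(size=d)
y = np.sign(X @ w_star)

rho = np.min(y * (X @ w_star))        # part (a): rho = min_n y_n (w*^T x_n) > 0
R2 = np.max(np.sum(X**2, axis=1))     # R^2 = max_n ||x_n||^2

w, t = np.zeros(d), 0
while True:
    misclassified = np.where(np.sign(X @ w) != y)[0]
    if misclassified.size == 0:
        break
    n = misclassified[0]
    w = w + y[n] * X[n]               # PLA update: w(t+1) = w(t) + y(t) x(t)
    t += 1
    assert w @ w_star >= t * rho - 1e-9    # part (b): w(t)^T w* >= t*rho
    assert w @ w <= t * R2 + 1e-9          # part (d): ||w(t)||^2 <= t*R^2

print(f"converged after t = {t} updates; "
      f"part (e) bound: t <= {R2 * (w_star @ w_star) / rho**2:.1f}")
```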

Q2

The first plot shows the number of operations for each iteration, and the second plot is a histogram of the log difference. Plots attached below:

Q3: LFD Problem 1.7

(a)

For $\mu = 0.05$:

$$P[\nu = 0 \mid 10 \text{ flips}, \mu = 0.05] = 0.95^{10} = 0.5987$$

1 Coin: $P = 0.5987$

10 Coins: $P = 1 - (1 - 0.95^{10})^{10} = 0.9999$

1000 Coins: $P = 1 - (1 - 0.95^{10})^{1000} \approx 1$

1000000 Coins: $P = 1 - (1 - 0.95^{10})^{1000000} \approx 1$

For $\mu = 0.8$:

$$P[\nu = 0 \mid 10 \text{ flips}, \mu = 0.8] = 0.2^{10} = 1.024 \times 10^{-7}$$

1 Coin: $P = 1.024 \times 10^{-7}$

10 Coins: $P = 1 - (1 - 0.2^{10})^{10} = 1.024 \times 10^{-6}$

1000 Coins: $P = 1 - (1 - 0.2^{10})^{1000} = 1.024 \times 10^{-4}$

1000000 Coins: $P = 1 - (1 - 0.2^{10})^{1000000} = 0.09733$
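As a quick sanity check on these numbers (not part of the original), a short Python sketch that evaluates $1 - (1 - (1-\mu)^{10})^n$ for each case:

```python
# Probability that at least one of n coins shows no heads in 10 flips,
# where each flip lands heads with probability mu.
def p_at_least_one_all_tails(mu, n, flips=10):
    p_single = (1 - mu) ** flips       # one coin gets nu = 0
    return 1 - (1 - p_single) ** n     # complement over n independent coins

for mu in (0.05, 0.8):
    for n in (1, 10, 1000, 1_000_000):
        print(f"mu={mu}, {n} coins: {p_at_least_one_all_tails(mu, n):.6g}")
```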

(b)

The exact result is $P\!\left[\max_i |\nu_i - \mu_i| > \epsilon\right] = 1 - P[|\nu_1 - \mu_1| \le \epsilon]\; P[|\nu_2 - \mu_2| \le \epsilon]$ for each $\epsilon$, since the two coins are independent; summing the binomial probabilities produces a step function in $\epsilon$. Using the Hoeffding bound together with the union bound for 2 coins:

$$P\!\left[\max_i |\nu_i - \mu_i| > \epsilon\right] \le 2 \cdot 2e^{-2N\epsilon^2}$$

Plot attached below:
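A hedged sketch of how the step function and the bound could be computed, assuming (as in the textbook's statement of this part) 2 fair coins with $\mu = 0.5$ and $N = 6$ tosses each:

```python
import numpy as np
from math import comb

N, mu = 6, 0.5   # assumed from the textbook: 2 fair coins, 6 tosses each

def p_single_dev(eps):
    # P[|nu - mu| > eps] for one coin: binomial pmf summed over deviating k's
    return sum(comb(N, k) * 0.5**N for k in range(N + 1) if abs(k / N - mu) > eps)

for eps in np.linspace(0, 1, 11):
    exact = 1 - (1 - p_single_dev(eps))**2              # 2 iid coins
    bound = min(1.0, 2 * 2 * np.exp(-2 * N * eps**2))   # Hoeffding + union bound
    print(f"eps={eps:.1f}  exact={exact:.4f}  bound={bound:.4f}")
```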

Q4: LFD Problem 1.8

(a)

By definition, for a non-negative random variable $t$ with density $f$, we have:

$$E(t) = \int_0^\infty \tau f(\tau)\,d\tau \ge \int_\alpha^\infty \tau f(\tau)\,d\tau \ge \alpha \int_\alpha^\infty f(\tau)\,d\tau = \alpha\, P[t \ge \alpha]$$

Then for each $\alpha > 0$, we have:

$$P[t \ge \alpha] \le \frac{E(t)}{\alpha}$$

Proved.
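Not in the original: a quick Monte Carlo illustration of the inequality, using an exponential random variable as an arbitrary non-negative example:

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.exponential(scale=2.0, size=1_000_000)   # non-negative r.v., E[t] = 2

for alpha in (1.0, 2.0, 4.0, 8.0):
    empirical = np.mean(t >= alpha)
    markov = t.mean() / alpha                    # P[t >= alpha] <= E[t] / alpha
    print(f"alpha={alpha}: P[t>=alpha]={empirical:.4f} <= {markov:.4f}")
```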

(b)

By definition, $E[(u - \mu)^2] = \sigma^2$. Since $(u - \mu)^2$ is a non-negative random variable, using the conclusion in part (a) with $t = (u - \mu)^2$, we have:

$$P[(u - \mu)^2 \ge \alpha] \le \frac{E[(u - \mu)^2]}{\alpha} = \frac{\sigma^2}{\alpha}$$

(c)

By definition, for $N$ iid random variables $u_1, u_2, \ldots, u_N$, each with $E(u_n) = \mu$ and $\mathrm{Var}(u_n) = \sigma^2$ for $1 \le n \le N$, let $u = \frac{1}{N}\sum_{n=1}^{N} u_n$. We have that:

$$E(u) = \frac{1}{N}\sum_{n=1}^{N} E(u_n) = \mu$$

And that, by independence:

$$\mathrm{Var}(u) = \frac{1}{N^2}\sum_{n=1}^{N} \mathrm{Var}(u_n) = \frac{\sigma^2}{N}$$

Then, similar to part (b):

$$P[(u - \mu)^2 \ge \alpha] \le \frac{\mathrm{Var}(u)}{\alpha} = \frac{\sigma^2}{N\alpha}$$

Proved.
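As an illustration (not part of the original), a small simulation checking that the squared deviation of the sample mean of $N$ draws obeys the bound $\sigma^2/(N\alpha)$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, trials = 50, 200_000
mu, sigma2 = 0.0, 1.0                  # standard normal draws, so sigma^2 = 1

u_bar = rng.normal(mu, np.sqrt(sigma2), size=(trials, N)).mean(axis=1)

for alpha in (0.01, 0.05, 0.1):
    empirical = np.mean((u_bar - mu)**2 >= alpha)
    bound = sigma2 / (N * alpha)       # P[(u_bar - mu)^2 >= alpha] <= sigma^2/(N*alpha)
    print(f"alpha={alpha}: empirical={empirical:.4f} <= bound={bound:.4f}")
```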
