Problem 1: We would like to test the ability of the first kernel method (discussed in class) to approximate pdfs. We therefore generate 1000 realizations of a random variable uniformly distributed in [0,1]. a) Approximate the corresponding pdf using the Gaussian kernel

K(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}.

Plot the resulting approximations for different values of h. Do not limit your graph to [0,1] but extend it beyond the two boundaries. What do you observe as far as capturing the support of the random variable (the interval [0,1]) and the constant value of the pdf (equal to 1 on [0,1]) is concerned? b) Repeat the same steps for the Laplacian kernel

K(x) = \frac{1}{2} e^{-|x|}.
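A minimal Matlab sketch of this experiment, assuming the standard kernel estimator f̂(x) = (1/(Nh)) Σ_i K((x − X_i)/h); the bandwidth values below are illustrative, not prescribed by the problem:

N = 1000;
X = rand(N, 1);                          % 1000 uniform samples in [0,1]
x = linspace(-0.5, 1.5, 500);            % evaluate beyond both boundaries
figure; hold on;
for h = [0.01 0.05 0.2]                  % try several bandwidths
    fhat = zeros(size(x));
    for i = 1:N
        u = (x - X(i)) / h;
        fhat = fhat + exp(-u.^2 / 2) / sqrt(2*pi);   % Gaussian kernel
        % For part b), use the Laplacian kernel instead:
        % fhat = fhat + 0.5 * exp(-abs(u));
    end
    plot(x, fhat / (N*h));               % normalized estimate
end
plot(x, (x >= 0 & x <= 1), 'k--');       % true pdf for reference
legend('h = 0.01', 'h = 0.05', 'h = 0.2', 'true pdf');
xlabel('x'); ylabel('estimated pdf');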
Problem 2: The Matlab data file hw4-2data.mat contains two matrices, stars and circles, each being a list of 2-dimensional (2-D) vectors. Each 2-D vector identifies a point in the 2-D space which is labeled either as star or circle. We are interested in developing a classifier that distinguishes between the two sets. We would like to use the kernel method to find a nonlinear separating boundary. To achieve this we assign the label 1 to stars and the label −1 to circles, and if f(X), X = [x_1, x_2]^⊤, is the transformation we would like to apply to the data, then we want to solve the following minimization problem to find the optimum f(X):

\min_{f \in V} \sum_{X_i \in \text{stars}} \big(f(X_i) - 1\big)^2 + \sum_{X_j \in \text{circles}} \big(f(X_j) + 1\big)^2 + \lambda \|f(X)\|^2, \quad (1)
where V is the vector space of functions generated by the Gaussian kernel
K(X,Y) = e^{-\frac{1}{h}\|X-Y\|^2} = e^{-\frac{1}{h}\{(x_1-y_1)^2 + (x_2-y_2)^2\}}.
a) Use the Representer theorem to prove that for the first two sums we can replace f(X) by its orthogonal projection f̄(X) onto the linear subspace generated by K(X, X_i), X_i ∈ stars, and K(X, X_j), X_j ∈ circles, where

\bar{f}(X) = \sum_{X_i \in \text{stars}} \alpha_i K(X, X_i) + \sum_{X_j \in \text{circles}} \beta_j K(X, X_j). \quad (2)
b) For the term ‖f(X)‖² use the orthogonality principle to show that

\|f(X)\|^2 = \|\bar{f}(X)\|^2 + \|f(X) - \bar{f}(X)\|^2 \geq \|\bar{f}(X)\|^2.
Does this suggest that we can replace f(X) with f̄(X) everywhere in the original cost function in (1)? If yes, then WHY? c) Using the form of f̄(X) defined in (2), find the optimum coefficients α_i, β_j. d) Once you identify the optimum f̄(X), explain how you are going to use it to classify a new point X_new as star or circle, given, of course, that f̄(X_new) will not be exactly equal to 1 or −1. e) After you have specified your final classification rule in d), find (numerically) the separating boundary for the two classes in the 2-D space (also place the training points on the 2-D plane to verify the quality of your boundary). Repeat the process for different values of h and λ.
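A minimal Matlab sketch of parts c)–e), assuming the regularized least-squares solution c = (G + λI)^{-1} y that the analysis above should yield (G is the Gram matrix of the training points) and one natural classification rule, sign(f̄(X_new)); the values of h and λ are illustrative:

load('hw4-2data.mat');                    % provides: stars, circles
Z = [stars; circles];                     % training points, one per row
y = [ones(size(stars,1),1); -ones(size(circles,1),1)];   % labels +1/-1
h = 0.5; lambda = 0.1;                    % illustrative values; vary them
D = sum(Z.^2,2) + sum(Z.^2,2)' - 2*(Z*Z');   % squared pairwise distances
G = exp(-D/h);                            % Gram matrix K(X_i, X_j)
c = (G + lambda*eye(size(G))) \ y;        % stacked coefficients alpha_i, beta_j

% Evaluate fbar on a grid and draw its zero-level set as the boundary.
g = linspace(min(Z(:)) - 1, max(Z(:)) + 1, 200);
[x1, x2] = meshgrid(g, g);
P = [x1(:) x2(:)];
Dg = sum(P.^2,2) + sum(Z.^2,2)' - 2*(P*Z');
f = exp(-Dg/h) * c;                       % fbar(X) at every grid point
figure; hold on;
contour(x1, x2, reshape(f, size(x1)), [0 0], 'k');   % boundary fbar(X) = 0
plot(stars(:,1), stars(:,2), 'r*');
plot(circles(:,1), circles(:,2), 'bo');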
Problem 3: Suppose that the scalar random variable y and the random vector X are related through the following model

y = \theta^\top X + w, \quad (3)

where w is a scalar random variable with mean 0 and independent from X, and θ is a deterministic vector. Assuming that we have knowledge of the statistical behavior of the random pair (y, X), we would like to estimate θ by solving the following minimization problem

\min_{A} E\big[(y - A^\top X)^2\big]. \quad (4)
a) Find the optimum A and prove that it is equal to θ. b) Assume now that the statistical behavior of the pair (y, X) is not available and, instead, you are observing pairs (y_t, X_t) that are independent realizations of (y, X). Find the learning (adaptive) algorithm that provides a new estimate θ_t every time a new pair (y_t, X_t) becomes available. The resulting algorithm is known as Least Mean Squares (LMS) and constitutes the best-known algorithm in Adaptive Signal Processing. c) Use your own favorite θ, of length 5, and generate pairs (y_t, X_t) using the model in (3), where X_t is a Gaussian vector of length 5 with independent and identically distributed elements of mean 0 and variance 1, and w_t is again Gaussian, independent from X_t, with mean 0 and variance 0.1. Apply your algorithm from b) and verify its convergence towards θ. Select a small learning rate so that your estimates are not very noisy. Plot the squared norm ‖θ_t − θ‖² as a function of the iteration t and examine whether it converges to something small. Use a logarithmic scale (semilogy in Matlab) to be able to observe differences between small quantities. d) Repeat the previous simulation but with a learning rate that is half the learning rate you used in question c). What do you observe as far as convergence rate and steady-state error are concerned? Can you compare the performance in the two cases and claim that one is better than the other?
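A minimal Matlab sketch of the simulation in parts c)–d), assuming the standard LMS update θ_t = θ_{t−1} + μ X_t (y_t − X_t^⊤ θ_{t−1}) that part b) should produce; the choices of θ, μ, and the number of iterations below are illustrative:

theta = [1; -2; 0.5; 3; -1];              % your favorite theta, length 5
T = 20000; mu = 0.01;                     % iterations and learning rate
est = zeros(5, 1);                        % running estimate theta_t
err = zeros(T, 1);
for t = 1:T
    X = randn(5, 1);                      % iid N(0,1) regressor, length 5
    w = sqrt(0.1) * randn;                % noise with mean 0, variance 0.1
    y = theta' * X + w;                   % model (3)
    est = est + mu * X * (y - X' * est);  % LMS update
    err(t) = norm(est - theta)^2;         % squared estimation error
end
semilogy(err);                            % log scale to see small values
xlabel('iteration t'); ylabel('||theta_t - theta||^2');
% For part d), rerun with mu/2 and compare convergence speed and the
% steady-state error floor of the two curves.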