- Representer Theorem. In this question, you'll prove and apply a simplified version of the Representer Theorem, which is the basis for a lot of kernelized algorithms. Consider a linear model:
$$z = \mathbf{w}^\top \boldsymbol{\psi}(\mathbf{x}), \qquad y = g(z),$$
where $\boldsymbol{\psi}$ is a feature map and $g$ is some function (e.g. identity, logistic, etc.). We are given a training set $\{(\mathbf{x}^{(i)}, t^{(i)})\}_{i=1}^N$. We are interested in minimizing the expected loss plus an L2 regularization term:
$$J(\mathbf{w}) = \frac{1}{N}\sum_{i=1}^N L\big(y^{(i)}, t^{(i)}\big) + \frac{\lambda}{2}\|\mathbf{w}\|^2,$$
where $L$ is some loss function. Let $\boldsymbol{\Psi}$ denote the feature matrix whose rows are the feature vectors of the training inputs:
$$\boldsymbol{\Psi} = \begin{pmatrix} \boldsymbol{\psi}(\mathbf{x}^{(1)})^\top \\ \vdots \\ \boldsymbol{\psi}(\mathbf{x}^{(N)})^\top \end{pmatrix}.$$
Observe that this formulation captures a lot of the models we've covered in this course, including linear regression, logistic regression, and SVMs.
- Show that the optimal weights must lie in the row space of $\boldsymbol{\Psi}$.
Hint: Given a subspace $S$, a vector $\mathbf{v}$ can be decomposed as $\mathbf{v} = \mathbf{v}_S + \mathbf{v}_\perp$, where $\mathbf{v}_S$ is the projection of $\mathbf{v}$ onto $S$, and $\mathbf{v}_\perp$ is orthogonal to $S$. (You may assume this fact without proof, but you can review it here[1].) Apply this decomposition to $\mathbf{w}$ and see if you can show something about one of the two components.
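The decomposition in the hint is easy to play with numerically. Below is a minimal sketch (the data, sizes, and names such as `Psi`, `w_S`, `w_perp` are made up for illustration, not part of the handout) showing how the projection onto the row space of $\boldsymbol{\Psi}$ and the orthogonal remainder behave:

```python
import numpy as np

# Minimal numerical sketch of the hint: split a weight vector w into its
# projection onto the row space of the feature matrix Psi and an orthogonal
# remainder, then check how each piece affects the pre-activations z = Psi @ w.
rng = np.random.default_rng(0)
N, D = 5, 8                          # fewer examples than features (illustrative sizes)
Psi = rng.standard_normal((N, D))    # rows play the role of psi(x^(i))^T
w = rng.standard_normal(D)

# Projection onto the row space: w_S = Psi^T (Psi Psi^T)^{-1} Psi w.
w_S = Psi.T @ np.linalg.solve(Psi @ Psi.T, Psi @ w)
w_perp = w - w_S

print(np.allclose(Psi @ w_perp, 0))               # True: w_perp is invisible to the data
print(np.allclose(Psi @ w, Psi @ w_S))            # True: predictions are unchanged
print(np.isclose(w_S @ w_perp, 0))                # True: the two components are orthogonal
print(np.linalg.norm(w_S) <= np.linalg.norm(w))   # True: dropping w_perp never increases the norm
```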
- [3pts] Another way of stating the result from part (a) is that $\mathbf{w} = \boldsymbol{\Psi}^\top \boldsymbol{\alpha}$ for some vector $\boldsymbol{\alpha}$. Hence, instead of solving for $\mathbf{w}$, we can solve for $\boldsymbol{\alpha}$. Consider the vectorized form of the L2 regularized linear regression cost function:
$$J(\mathbf{w}) = \frac{1}{2N}\|\boldsymbol{\Psi}\mathbf{w} - \mathbf{t}\|^2 + \frac{\lambda}{2}\|\mathbf{w}\|^2.$$
Substitute in $\mathbf{w} = \boldsymbol{\Psi}^\top \boldsymbol{\alpha}$ to write the cost function as a function of $\boldsymbol{\alpha}$. Determine the optimal value of $\boldsymbol{\alpha}$. Your answer should be an expression involving $\lambda$, $\mathbf{t}$, and the Gram matrix $\mathbf{K} = \boldsymbol{\Psi}\boldsymbol{\Psi}^\top$. For simplicity, you may assume that $\mathbf{K}$ is positive definite. (The algorithm still works if $\mathbf{K}$ is merely PSD, it's just a bit more work to derive.)
Hint: the cost function $J(\boldsymbol{\alpha})$ is a quadratic function. Simplify the formula into the following form:
$$J(\boldsymbol{\alpha}) = \frac{1}{2}\boldsymbol{\alpha}^\top \mathbf{A}\boldsymbol{\alpha} - \boldsymbol{\alpha}^\top \mathbf{b} + c$$
for some positive definite matrix $\mathbf{A}$, vector $\mathbf{b}$, and constant $c$ (which can be ignored). You may assume without proof that the minimum of such a quadratic function is given by $\boldsymbol{\alpha} = \mathbf{A}^{-1}\mathbf{b}$.
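For concreteness, here is a minimal sketch of regression in the dual ($\boldsymbol{\alpha}$) parameters. It assumes the cost above with the $\frac{1}{2N}$ average squared error, under which the solve below takes the form $(\mathbf{K} + \lambda N \mathbf{I})\boldsymbol{\alpha} = \mathbf{t}$; the RBF kernel and the toy dataset are purely illustrative and not part of the handout.

```python
import numpy as np

# Sketch of kernel ridge regression in the dual parameters, assuming the cost
# above with the 1/(2N) average squared error term; kernel and data are toy.
def rbf_kernel(X1, X2, gamma=1.0):
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
t = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)

lam = 0.1
N = X.shape[0]
K = rbf_kernel(X, X)                                   # Gram matrix K = Psi Psi^T
alpha = np.linalg.solve(K + lam * N * np.eye(N), t)    # dual weights for this cost

# Predictions at new inputs need only kernel evaluations, never psi(x) itself:
# y(x) = sum_i alpha_i k(x, x^(i)).
X_test = np.linspace(-3, 3, 5)[:, None]
y_pred = rbf_kernel(X_test, X) @ alpha
print(np.round(y_pred, 3))
```

Note that everything here depends on the training inputs only through $\mathbf{K}$, which is what makes the kernel trick possible.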
- Compositional Kernels. One of the most useful facts about kernels is that they can be composed using addition and multiplication. I.e., the sum of two kernels is a kernel, and the product of two kernels is a kernel. We'll show this in the case of kernels which represent dot products between finite feature vectors.
- Suppose $k_1(\mathbf{x},\mathbf{x}') = \boldsymbol{\psi}_1(\mathbf{x})^\top \boldsymbol{\psi}_1(\mathbf{x}')$ and $k_2(\mathbf{x},\mathbf{x}') = \boldsymbol{\psi}_2(\mathbf{x})^\top \boldsymbol{\psi}_2(\mathbf{x}')$. Let $k_S$ be the sum kernel $k_S(\mathbf{x},\mathbf{x}') = k_1(\mathbf{x},\mathbf{x}') + k_2(\mathbf{x},\mathbf{x}')$. Find a feature map $\boldsymbol{\psi}_S$ such that $k_S(\mathbf{x},\mathbf{x}') = \boldsymbol{\psi}_S(\mathbf{x})^\top \boldsymbol{\psi}_S(\mathbf{x}')$.
- Suppose $k_1(\mathbf{x},\mathbf{x}') = \boldsymbol{\psi}_1(\mathbf{x})^\top \boldsymbol{\psi}_1(\mathbf{x}')$ and $k_2(\mathbf{x},\mathbf{x}') = \boldsymbol{\psi}_2(\mathbf{x})^\top \boldsymbol{\psi}_2(\mathbf{x}')$. Let $k_P$ be the product kernel $k_P(\mathbf{x},\mathbf{x}') = k_1(\mathbf{x},\mathbf{x}')\,k_2(\mathbf{x},\mathbf{x}')$. Find a feature map $\boldsymbol{\psi}_P$ such that $k_P(\mathbf{x},\mathbf{x}') = \boldsymbol{\psi}_P(\mathbf{x})^\top \boldsymbol{\psi}_P(\mathbf{x}')$.
Hint: For inspiration, consider the quadratic kernel from Lecture 20, Slide 11.
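Once you have candidate feature maps, it is easy to sanity-check them numerically. The sketch below uses two hand-picked toy feature maps (nothing here comes from the handout) and tests one natural candidate for each case:

```python
import numpy as np

# Numerical sanity check with two toy feature maps: try concatenation as a
# candidate feature map for the sum kernel, and a flattened outer product
# (all pairwise products of features) as a candidate for the product kernel.
def psi1(x):
    return np.array([x[0], x[1], x[0] * x[1]])

def psi2(x):
    return np.array([1.0, x[0] ** 2])

def k1(x, xp): return psi1(x) @ psi1(xp)
def k2(x, xp): return psi2(x) @ psi2(xp)

def psi_S(x):                                  # candidate for k_S = k1 + k2
    return np.concatenate([psi1(x), psi2(x)])

def psi_P(x):                                  # candidate for k_P = k1 * k2
    return np.outer(psi1(x), psi2(x)).ravel()

x  = np.array([0.5, -1.0])
xp = np.array([2.0, 0.3])
print(np.isclose(psi_S(x) @ psi_S(xp), k1(x, xp) + k2(x, xp)))   # True
print(np.isclose(psi_P(x) @ psi_P(xp), k1(x, xp) * k2(x, xp)))   # True
```

A numerical check on one pair of points is not a proof, of course; the written answer should verify the identity for arbitrary $\mathbf{x}, \mathbf{x}'$.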
[1] https://metacademy.org/graphs/concepts/projection_onto_a_subspace