1 Problem 1
Cross-validation for Polynomial Fitting: Consider the problem of fitting a polynomial function. Assume we wish to find a one-dimensional function f : ℝ → ℝ that takes a scalar input and outputs a scalar. The function has the form

f(x; \theta) = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_d x^d

where d is the degree of the polynomial. Develop code that finds the θ which minimizes the empirical risk

R(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left( f(x_i; \theta) - y_i \right)^2
on a data-set. To help you get started, download the Matlab code in polyreg.m (on the tutorial web page) to do polynomial curve fitting. Use your code on the dataset problem1.mat. This should include a matrix x, corresponding to the scalar features {x_1, ..., x_N}, and a matrix y, corresponding to the scalar labels {y_1, ..., y_N}. Fit a polynomial model to this data for various choices of d, the degree of the polynomial.
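As a rough illustration of the fitting step (a sketch only, independent of the provided polyreg.m, whose interface may differ; polyfit_ls and all variable names here are placeholders), a degree-d polynomial can be fit by building a design matrix of powers of x and solving the least-squares problem:

    % Sketch: fit a degree-d polynomial by least squares; x and y are column vectors.
    function [theta, risk] = polyfit_ls(x, y, d)
        N = numel(x);
        Phi = ones(N, d + 1);                  % design matrix [1, x, x.^2, ..., x.^d]
        for j = 1:d
            Phi(:, j + 1) = x .^ j;
        end
        theta = Phi \ y;                       % least-squares estimate of theta
        risk = mean((Phi * theta - y) .^ 2);   % empirical risk R(theta)
    end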
Which value(s) of d seem more reasonable? Please justify your answer using an empirical measure.
It is easy to overfit the data when using polynomial regression. As a result, use cross-validation by randomly splitting the data-set into two halves to select the complexity of the model (in this case, the degree of the polynomial). Include a plot showing the training and testing risk across various choices of d, and plot your f(x; θ) overlaid on the data for the best choice of d according to cross-validation.
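A minimal sketch of the two-fold split and degree sweep, reusing the polyfit_ls helper above (the degree range is an assumption, not given by the assignment):

    % Sketch: two-fold cross-validation over the polynomial degree d.
    load problem1.mat                          % provides x and y
    N = numel(x);
    perm = randperm(N);                        % random split into two halves
    tr = perm(1:floor(N/2));
    te = perm(floor(N/2)+1:end);
    degrees = 1:15;                            % assumed range of degrees to try
    trainRisk = zeros(size(degrees));
    testRisk  = zeros(size(degrees));
    for k = 1:numel(degrees)
        d = degrees(k);
        [theta, trainRisk(k)] = polyfit_ls(x(tr), y(tr), d);
        PhiTe = ones(numel(te), d + 1);        % design matrix for the test half
        for j = 1:d
            PhiTe(:, j + 1) = x(te) .^ j;
        end
        testRisk(k) = mean((PhiTe * theta - y(te)) .^ 2);
    end
    plot(degrees, trainRisk, '-o', degrees, testRisk, '-s');
    xlabel('degree d'); legend('training risk', 'testing risk');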
2 Problem 2
Regularized risk minimization: Modify the Matlab code in polyreg.m so that it learns a multivariate regression function f : ℝ^k → ℝ, where the basis functions are of the form

f(x; \theta) = \sum_{i=1}^{k} \theta_i x_i
The data-set is available in problem2.mat. As before, the x variable contains {x_1, ..., x_N} and the y variable contains their scalar labels {y_1, ..., y_N}.
Use an ℓ2 penalty to control the complexity of the model, i.e., minimize the regularized risk

R_{reg}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - f(x_i; \theta) \right)^2 + \frac{\lambda}{2N} \|\theta\|^2
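Because the model is linear in θ, this risk has a closed-form minimizer. A sketch of the derivation, assuming the λ/(2N) scaling above and stacking the inputs as the rows of a matrix X:

\nabla R_{reg}(\theta) = \frac{2}{N} X^\top (X\theta - y) + \frac{\lambda}{N} \theta = 0
\quad \Longrightarrow \quad
\theta^* = \left( X^\top X + \frac{\lambda}{2} I \right)^{-1} X^\top y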
Use two-fold cross-validation (as in Problem 1) to find the best value for λ. Include a plot showing training and testing risk across various choices of λ. A reasonable range for this data set would be from λ = 0 to λ = 1000. Also, mark the λ which minimizes the testing error on the data set.
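A sketch of the λ sweep using the closed form above (the grid spacing and variable names are assumptions):

    % Sketch: two-fold cross-validation over lambda for the linear model.
    load problem2.mat                          % provides x (N-by-k) and y (N-by-1)
    [N, k] = size(x);
    perm = randperm(N);
    tr = perm(1:floor(N/2));
    te = perm(floor(N/2)+1:end);
    lambdas = linspace(0, 1000, 201);          % assumed grid over the suggested range
    trainRisk = zeros(size(lambdas));
    testRisk  = zeros(size(lambdas));
    for j = 1:numel(lambdas)
        lam = lambdas(j);
        theta = (x(tr,:)' * x(tr,:) + (lam / 2) * eye(k)) \ (x(tr,:)' * y(tr));
        trainRisk(j) = mean((x(tr,:) * theta - y(tr)) .^ 2);
        testRisk(j)  = mean((x(te,:) * theta - y(te)) .^ 2);
    end
    [~, best] = min(testRisk);                 % lambda minimizing the testing risk
    plot(lambdas, trainRisk, lambdas, testRisk, lambdas(best), testRisk(best), 'r*');
    xlabel('\lambda'); legend('training risk', 'testing risk', 'best \lambda');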
What do you notice about the training and testing error?
3 Problem 3
Logistic Squashing Function. The logistic squashing function is given by g(z) = 1/(1 + exp(-z)). Show that it satisfies the property g(-z) = 1 - g(z). Also show that its inverse is given by g^{-1}(y) = ln(y/(1 - y)).
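One way to verify the first identity (a sketch of the algebra, not a substitute for your own write-up):

1 - g(z) = 1 - \frac{1}{1 + e^{-z}} = \frac{e^{-z}}{1 + e^{-z}} = \frac{1}{e^{z} + 1} = g(-z)

For the inverse, setting y = 1/(1 + e^{-z}) and solving for z gives e^{-z} = (1 - y)/y, hence z = \ln(y/(1 - y)).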
4 Problem 4
Logistic Regression: Implement a linear logistic regression algorithm for binary classification in Matlab using gradient descent. Your code should accept a dataset {(x_1, y_1), ..., (x_N, y_N)} where x_i ∈ ℝ^d and y_i ∈ {0, 1} and find a parameter vector θ ∈ ℝ^d for the classification function
f(x; \theta) = \left( 1 + \exp(-\theta^\top x) \right)^{-1}
which minimizes the empirical risk with logistic loss
R_{emp}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left[ (y_i - 1) \log(1 - f(x_i; \theta)) - y_i \log(f(x_i; \theta)) \right]
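For reference when writing up the required derivation: using g'(z) = g(z)(1 - g(z)), the gradient of this risk reduces to a compact form (a sketch, under the definitions above):

\nabla_\theta R_{emp}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left( f(x_i; \theta) - y_i \right) x_i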
Since you are using gradient descent, you will have to specify the step size η and the tolerance ε. Pick reasonable values for η and ε, then use your code to learn a classification function for the dataset in dataset4.mat. Type load dataset4 and you will have the variables X (input vectors) and Y (binary labels), which contain the dataset, in your Matlab environment.
Show any derivations you need to make for this algorithm.
Use the whole data set for training. Show with figures the resulting linear decision boundary on the 2D X data. Show the binary classification error and the empirical risk you obtained throughout the run, from random initialization until convergence. Note the number of iterations needed for your choice of η and ε.
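A minimal sketch of the gradient descent loop using the gradient derived above (the values of η, ε, and the iteration cap are assumed examples, not prescribed by the assignment):

    % Sketch: batch gradient descent for linear logistic regression.
    load dataset4                              % provides X (N-by-d) and Y (N-by-1)
    [N, d] = size(X);
    theta = 0.01 * randn(d, 1);                % random initialization
    eta = 0.1; epsilon = 1e-6;                 % assumed step size and tolerance
    maxIter = 100000;
    for t = 1:maxIter
        f = 1 ./ (1 + exp(-X * theta));        % f(x_i; theta) for all i
        grad = X' * (f - Y) / N;               % gradient of R_emp
        theta = theta - eta * grad;
        if eta * norm(grad) < epsilon          % stop when the update is tiny
            break;
        end
    end
    Remp = mean((Y - 1) .* log(1 - f) - Y .* log(f));  % empirical risk
    err  = mean((f > 0.5) ~= Y);               % binary classification error
    fprintf('iterations: %d, risk: %.4f, error: %.4f\n', t, Remp, err);
    % For 2D inputs, the decision boundary is the line theta' * x = 0.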