Name: [Solved] CSCI5521-Homework 1

Consider doing least squares regression based on a training set Z_train = {(x^t, r^t), t = 1, …, N}, where x^t ∈ ℝ and r^t ∈ ℝ.

(i) Consider fitting a linear model of the form

g₁(x) = w₁x + w₀,

with unknown parameters w₁, w₀ ∈ ℝ, which are selected so as to minimize the following empirical loss:

E(w₁, w₀ | Z_train) = Σ_{t=1}^{N} (r^t − g₁(x^t))²

Derive the optimal values of (w₁, w₀), clearly showing all steps of the derivation.

(ii) Consider fitting a polynomial model of the form

g₂(x) = v₂x²⁰²⁰ + v₁x + v₀,

with unknown parameters v₂, v₁, v₀ ∈ ℝ, which are selected so as to minimize the following empirical loss:

E(v₂, v₁, v₀ | Z_train) = Σ_{t=1}^{N} (r^t − g₂(x^t))²

Derive the optimal values of (v₂, v₁, v₀), clearly showing all steps of the derivation.^[1]

(iii) For a given training set Z_train, let (w₁*, w₀*) be the optimal values of (w₁, w₀) in (i) above, and let (v₂*, v₁*, v₀*) be the optimal values of (v₂, v₁, v₀) in (ii) above. Professor Gopher claims that the following is true for any given Z_train:

Is Professor Gopher's claim correct? Clearly explain your answer.^[2]
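A derivation for part (i) can be sanity-checked numerically. The sketch below assumes the empirical loss is the usual sum of squared errors; the function name fit_line and the toy data are hypothetical and only serve to exercise the closed form against NumPy's generic least-squares solver.

```python
import numpy as np

def fit_line(x, r):
    """Closed-form least-squares fit of g1(x) = w1 * x + w0 (squared-error loss assumed)."""
    x = np.asarray(x, dtype=float)
    r = np.asarray(r, dtype=float)
    x_bar, r_bar = x.mean(), r.mean()
    # Setting both partial derivatives of the squared-error loss to zero gives:
    w1 = np.sum((x - x_bar) * (r - r_bar)) / np.sum((x - x_bar) ** 2)
    w0 = r_bar - w1 * x_bar
    return w1, w0

# Hypothetical toy data, not part of the assignment.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)
r = 2.0 * x + 0.5 + 0.1 * rng.standard_normal(50)

w1, w0 = fit_line(x, r)

# Cross-check against a generic least-squares solve of [x, 1] @ [w1, w0] = r.
A = np.column_stack([x, np.ones_like(x)])
w_ref, *_ = np.linalg.lstsq(A, r, rcond=None)
print(w1, w0, w_ref)
```

If the two answers disagree, the derivation (or the assumed form of the loss) is worth revisiting.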

Consider the following 4 × 4 matrix:

    [ 1  1   1   1 ]
A = [ 1  2   4   8 ]
    [ 1  3   9  27 ]
    [ 1  4  16  64 ]

(i) What are the values of tr(A), tr(A^T), tr(A^T A), and tr(A A^T)?³
(ii) From a geometric perspective, explain how the absolute value of |A| (the determinant of A) can be computed.
(iii) Are the rows of A linearly independent? Clearly explain your answer.^[2]

(For this problem, you can use Python libraries to arrive at your answer. If you do that, clearly explain what you did and why. There is a way to arrive at the answer without using Python libraries.)
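If Python libraries are used for this problem, the computations might look like the following minimal sketch. It uses only standard NumPy calls; the variable names are illustrative. A full-rank result is one way to support a claim of linearly independent rows, but it does not replace the written explanation the problem asks for.

```python
import numpy as np

# The matrix from the problem statement; row i is [1, i, i^2, i^3] for i = 1..4.
A = np.array([
    [1, 1, 1, 1],
    [1, 2, 4, 8],
    [1, 3, 9, 27],
    [1, 4, 16, 64],
])

tr_A = np.trace(A)
tr_At = np.trace(A.T)
tr_AtA = np.trace(A.T @ A)   # equals the sum of squares of all entries of A
tr_AAt = np.trace(A @ A.T)

det_A = np.linalg.det(A)             # floating-point value, so compare with a tolerance
rank_A = np.linalg.matrix_rank(A)    # rank 4 would mean the 4 rows are linearly independent

print(tr_A, tr_At, tr_AtA, tr_AAt)
print(det_A, rank_A)
```

Geometrically, |det(A)| is the volume of the parallelepiped spanned by the rows of A, which is nonzero exactly when the rows are linearly independent.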

Programming assignments: The next two problems involve programming. We will be considering three datasets (derived from two available datasets) for these assignments:

Boston: The Boston housing dataset comes pre-packaged with scikit-learn. The dataset has 506 points, 13 features, and 1 target (response) variable. You can find more information about the dataset here:

https://github.com/rupakc/UCI-Data-Analysis/tree/master/Boston Housing Dataset/Boston Housing

While the original dataset is for a regression problem, we will create two classification datasets for the homework. Note that you only need to work with the response r to create these classification datasets.

Boston50: Let τ₅₀ be the median (50th percentile) over all r (response) values. Create a 2-class classification problem such that y = 1 if r ≥ τ₅₀ and y = 0 if r < τ₅₀. By construction, note that the class priors will be p(y = 1) ≈ 1/2, p(y = 0) ≈ 1/2.
Boston75: Let τ₇₅ be the 75th percentile over all r (response) values. Create a 2-class classification problem such that y = 1 if r ≥ τ₇₅ and y = 0 if r < τ₇₅. By construction, note that the class priors will be p(y = 1) ≈ 1/4, p(y = 0) ≈ 3/4.
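The thresholding described for Boston50 and Boston75 can be sketched as below. The helper name make_percentile_labels and the stand-in response vector are hypothetical; in the assignment, r would be the Boston target values. The sketch assumes NumPy's default (linear interpolation) percentile definition.

```python
import numpy as np

def make_percentile_labels(r, q):
    """Return y with y = 1 where r >= the q-th percentile of r, and y = 0 otherwise."""
    r = np.asarray(r, dtype=float)
    threshold = np.percentile(r, q)
    return (r >= threshold).astype(int)

# Hypothetical stand-in for the response values (not the real Boston targets).
r = np.arange(1, 101, dtype=float)

y50 = make_percentile_labels(r, 50)   # Boston50-style labels
y75 = make_percentile_labels(r, 75)   # Boston75-style labels

# Empirical class priors p(y = 1) for each labeling.
print(y50.mean(), y75.mean())
```

With ties at the threshold in real data, the empirical priors may deviate slightly from the nominal fractions.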

Digits: The Digits dataset comes prepackaged with scikit-learn. The dataset has 1797 points, 64 features, and 10 classes corresponding to the ten digits 0, 1, …, 9. The dataset was (likely) created from the following dataset:

http://archive.ics.uci.edu/ml/datasets/Pen-Based+Recognition+of+Handwritten+Digits

The 2-class classification datasets from Boston50, Boston75, and the 10-class classification dataset from Digits will be used in the following two problems.

We will consider three methods from scikit-learn: LinearSVC, SVC, and LogisticRegression. Use the following parameters for the different methods mentioned:

LinearSVC: max_iter=2000

SVC: gamma='scale', C=10

LogisticRegression: penalty='l2', solver='lbfgs', multi_class='multinomial', max_iter=5000

(i) Develop code for my_cross_val(method, X, y, k), which performs k-fold cross-validation on (X, y) using method, and returns the error rate in each fold. Using my_cross_val, report the error rates in each fold as well as the mean and standard deviation of error rates across folds for the three methods: LinearSVC, SVC, and LogisticRegression, applied to the three classification datasets: Boston50, Boston75, and Digits.

You will have to submit (a) code and (b) summary of results for my_cross_val:

(a) Code: You will have to submit code for my_cross_val(method, X, y, k) (main file) as well as a wrapper code q3i().

The main file has input: (1) method, which specifies the (class) name of one of the three classification methods under consideration, (2) X, y, which is data for the 2-class or 10-class classification problem, (3) k, the number of folds for cross-validation, and output: (1) the test set error rates for each of the k folds.
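One possible shape for my_cross_val is sketched below. Several things here are assumptions rather than requirements of the assignment: method is taken to be a class or zero-argument callable that builds a fresh model with fit/predict, the extra seed argument controls shuffling, and ToyCentroidClassifier is a hypothetical stand-in so that the example runs without scikit-learn.

```python
import numpy as np

class ToyCentroidClassifier:
    """Hypothetical stand-in with a scikit-learn-style fit/predict interface."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Squared distance from every point to every class centroid.
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]

def my_cross_val(method, X, y, k, seed=0):
    """Return an array with the test error rate of each of the k folds."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)  # k nearly equal index sets
    errors = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = method()  # fresh, untrained model for every fold
        model.fit(X[train_idx], y[train_idx])
        errors.append(np.mean(model.predict(X[test_idx]) != y[test_idx]))
    return np.array(errors)

# Hypothetical two-class toy data (two well-separated Gaussian blobs).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

errs = my_cross_val(ToyCentroidClassifier, X, y, k=10)
print(errs, errs.mean(), errs.std())
```

A scikit-learn estimator with the required parameters could be passed the same way, for instance through a small lambda that constructs LinearSVC with max_iter=2000.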

The wrapper code has no input and is used to prepare the datasets, and make calls to my_cross_val(method, X, y, k) to generate the results for each dataset and each method. Make sure the calls to my_cross_val(method, X, y, k) are made in the following order and add a print to the terminal before each call to show which method and dataset is being used:

1. LinearSVC with Boston50
2. LinearSVC with Boston75
3. LinearSVC with Digits
4. SVC with Boston50
5. SVC with Boston75
6. SVC with Digits
7. LogisticRegression with Boston50
8. LogisticRegression with Boston75
9. LogisticRegression with Digits

For example, the first call to my_cross_val(method, X, y, k) with k = 10 should result in the following output:

Error rates for LinearSVC with Boston50:
Fold 1: ###
Fold 2: ###
...
Fold 10: ###
Mean: ###
Standard Deviation: ###
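The output layout above could be produced by a small helper such as the following sketch. The name format_report, the four-decimal formatting, and the use of the population standard deviation (NumPy's default) are all assumptions; the error values passed in are placeholders, not results.

```python
import numpy as np

def format_report(method_name, dataset_name, errors):
    """Build the per-fold report text in the layout shown in the assignment."""
    errors = np.asarray(errors, dtype=float)
    lines = [f"Error rates for {method_name} with {dataset_name}:"]
    lines += [f"Fold {i}: {e:.4f}" for i, e in enumerate(errors, start=1)]
    lines.append(f"Mean: {errors.mean():.4f}")
    lines.append(f"Standard Deviation: {errors.std():.4f}")  # population SD (ddof=0)
    return "\n".join(lines)

# Placeholder error rates, used only to show the layout.
report = format_report("LinearSVC", "Boston50", [0.10, 0.20, 0.30])
print(report)
```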

(b) Summary of results: For each dataset and each method, report the test set error rates for each of the k = 10 folds, the mean error rate over the k folds, and the standard deviation of the error rates over the k folds. Make a table to present the results for each method and each dataset (9 tables in total). Include a column in the table for each fold, and add two columns at the end to show the overall mean error rate and standard deviation over the k folds. For example:

		Error rates for LinearSVC with Boston50
F1	F2	F3	F4	F5	F6	F7	F8	F9	F10	Mean	SD
#	#	#	#	#	#	#	#	#	#	#	#

(ii) Develop code for my_train_test(method, X, y, π, k), which performs random splits on the data (X, y) so that a fraction π ∈ [0,1] of the data is used for training using method, the rest is used for testing, and the process is repeated k times, after which the code returns the error rate for each such train-test split. Using my_train_test, with π = 0.75 and k = 10, report the mean and standard deviation of the error rate for the three methods: LinearSVC, SVC, and LogisticRegression, applied to the three classification datasets: Boston50, Boston75, and Digits.

You will have to submit (a) code and (b) summary of results for my_train_test:

(a) Code: You will have to submit code for my_train_test(method, X, y, π, k) (main file) as well as a wrapper code q3ii().

This main file has input: (1) method, which specifies the (class) name of one of the three classification methods under consideration, (2) X, y, which is data for the 2-class or 10-class classification problem, (3) π, the fraction of data chosen randomly to be used for training, (4) k, the number of times the train-test split will be repeated, and output: (1) the test set error rates for each of the k splits printed to the terminal.
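A repeated random train-test split along these lines might be sketched as follows. As assumptions: the training fraction is passed as pi with 0 < pi < 1, method is a class or zero-argument callable that builds a fresh model, seed is an extra argument for reproducibility, and ToyCentroidClassifier is a hypothetical stand-in so the example runs without scikit-learn.

```python
import numpy as np

class ToyCentroidClassifier:
    """Hypothetical stand-in with a scikit-learn-style fit/predict interface."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]

def my_train_test(method, X, y, pi, k, seed=0):
    """Return the test error rate of each of k random train-test splits."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    n_train = int(round(pi * len(y)))  # size of the training portion
    errors = []
    for _ in range(k):
        perm = rng.permutation(len(y))  # a new random split on every repetition
        train_idx, test_idx = perm[:n_train], perm[n_train:]
        model = method()
        model.fit(X[train_idx], y[train_idx])
        errors.append(np.mean(model.predict(X[test_idx]) != y[test_idx]))
    return np.array(errors)

# Hypothetical two-class toy data.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

errs = my_train_test(ToyCentroidClassifier, X, y, pi=0.75, k=10)
print(errs, errs.mean(), errs.std())
```

Unlike k-fold cross-validation, the test sets of different repetitions can overlap here, since each split is drawn independently.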

The wrapper code has no input and is used to prepare the datasets, and make calls to my_train_test(method, X, y, π, k) to generate the results for each dataset and each method (9 combinations in total). Make sure the calls to my_train_test(method, X, y, π, k) are made in the following order and add a print to the terminal before each call to show which method and dataset is being used:

1. LinearSVC with Boston50
2. LinearSVC with Boston75
3. LinearSVC with Digits
4. SVC with Boston50
5. SVC with Boston75
6. SVC with Digits
7. LogisticRegression with Boston50
8. LogisticRegression with Boston75
9. LogisticRegression with Digits

(b) Summary of results: For each dataset and each method, report the test set error rates for each of the k = 10 runs with π = 0.75, the mean error rate over the k runs, and the standard deviation of the error rates over the k runs. Make a table to present the results for each method and each dataset (9 tables in total). Include a column in the table for each run, and add two columns at the end to show the overall mean error rate and standard deviation over the k runs.

This problem considers a preliminary exercise in feature engineering with a focus on the Digits dataset. Represented as (X, y), the Digits dataset has X ∈ ℝ^{1797×64}, i.e., 1797 training points, each having 64 features, and y ∈ {0, 1, …, 9}^{1797}, i.e., 1797 training labels with each y_i ∈ {0, 1, …, 9}. We will consider three methods from scikit-learn: LinearSVC, SVC, and LogisticRegression for this problem. Use the following parameters for the different methods mentioned:

LinearSVC: max_iter=2000

SVC: gamma='scale', C=10

LogisticRegression: penalty='l2', solver='lbfgs', multi_class='multinomial', max_iter=5000

(i) For the Digits dataset, starting with X ∈ ℝ^{1797×64}, you will create a new feature representation X₁ ∈ ℝ^{1797×32} as follows: Construct a (random) matrix G ∈ ℝ^{64×32} where each element g_ij ∼ N(0,1), i.e., sampled independently from a univariate normal distribution, and then compute X₁ = XG. Using (X₁, y), perform 10-fold cross-validation^[3] using the three methods: LinearSVC, SVC, and LogisticRegression, and report the mean and the standard deviation of the 10-fold test set error rate.^[4] The creation of X₁ will be done based on a function rand_proj(X, d), where d = 32 for this problem, and the function will return X₁.
(ii) For the Digits dataset, starting with X ∈ ℝ^{1797×64}, you will create a new feature representation X₂ ∈ ℝ^{1797×2144} as follows: For any training data x_i ∈ ℝ⁶⁴, let the elements be x_ij, j = 1, …, 64. The new feature set x̃_i ∈ ℝ²¹⁴⁴ will include all the original features x_ij, j = 1, …, 64, squares of the original features x²_ij, j = 1, …, 64, and products of all pairs of distinct original features x_ij x_ij′, j < j′, j = 1, …, 64, j′ = j+1, …, 64. You should verify that the new x̃_i ∈ ℝ²¹⁴⁴ and hence X₂ ∈ ℝ^{1797×2144}. Using (X₂, y), perform 10-fold cross-validation^[3] using the three methods: LinearSVC, SVC, and LogisticRegression, and report the mean and the standard deviation of the 10-fold test set error rate. The creation of X₂ will be done based on a function quad_proj(X), and the function will return X₂.
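The quadratic feature map can be sketched with NumPy as below, which also makes the dimension count easy to verify: 64 original features, 64 squares, and 64·63/2 = 2016 pairwise products give 2144 columns. The ordering of the output columns and the random stand-in input are assumptions made for illustration.

```python
import numpy as np

def quad_proj(X):
    """Stack original features, their squares, and all products x_j * x_j' with j < j'."""
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    j, j_prime = np.triu_indices(d, k=1)   # every index pair with j < j'
    cross = X[:, j] * X[:, j_prime]        # shape (n, d * (d - 1) / 2)
    return np.hstack([X, X ** 2, cross])

# Hypothetical stand-in with 64 features (not the real Digits data).
X = np.random.default_rng(0).random((5, 64))
X2 = quad_proj(X)
print(X2.shape)

# Tiny example where the result can be checked by hand.
small = quad_proj([[1.0, 2.0, 3.0]])
print(small)
```

For the small example, the row is the features 1, 2, 3, then the squares 1, 4, 9, then the products 1·2, 1·3, 2·3.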

You will have to submit (a) code and (b) summary of results for all three parts:

(a) Code: You will have to submit code for rand_proj(X, d), quad_proj(X) as well as a wrapper code q4().

rand_proj(X, d) has input: (1) X, which is data (features) for the classification problem, (2) d, the dimensionality of the projected features, and output: (1) X₁ ∈ ℝ^{1797×d}, the new data for the problem. This output array does not need to be printed to the terminal.

quad_proj(X) has input: X, which is data (features) for the classification problem, and output: (1) X₂, the new data with all linear and quadratic combinations of features as described above. This output array does not need to be printed to the terminal.
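A random projection with the stated input and output could look like the following sketch. The optional seed argument is an addition for reproducibility and is not part of the signature in the assignment; the input array is a random stand-in with the same shape as the Digits features.

```python
import numpy as np

def rand_proj(X, d, seed=None):
    """Project X (n x D) to n x d using a matrix G with i.i.d. N(0, 1) entries."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((X.shape[1], d))  # G has shape D x d
    return X @ G

# Hypothetical stand-in with the Digits shape (not the real data).
X = np.random.default_rng(0).random((1797, 64))

X1 = rand_proj(X, 32, seed=123)
X1_again = rand_proj(X, 32, seed=123)   # same seed, same G, same projection
X1_other = rand_proj(X, 32, seed=456)   # different G, different projection
print(X1.shape)
```

Leaving seed at None draws a new G on every call, so the downstream error rates will vary somewhat from run to run.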

The wrapper code has no input and uses the above functions to execute all the classification exercises outlined in (i) and (ii) above and print the test set error rates for each of the k folds to the terminal. Make sure the exercises are executed in the following order and add a print to the terminal before each execution to show which method and dataset is being used:

1. LinearSVC with X₁
2. LinearSVC with X₂
3. SVC with X₁
4. SVC with X₂
5. LogisticRegression with X₁
6. LogisticRegression with X₂

(b) Summary of results: For each dataset, i.e., X₁ and X₂, and each method, report the mean error rate over the k folds, and the standard deviation of the error rates over the k folds. Make a table to present the results for each method and each dataset (6 tables in total). Include a column in the table for each fold, and add two columns at the end to show the overall mean error rate and standard deviation over the k folds.

[1] It is OK to leave the solution in terms of a linear system, say Av = b, where A ∈ ℝ^{3×3} and b ∈ ℝ³ are known, and v = [v₀ v₁ v₂]^T ∈ ℝ³ is a vector of the unknown parameters. If you choose to do this, please also mention your preferred approach to solve such a linear system.

[2] A correct answer with insufficient or incorrect explanation will not get any credit.

³ For this problem, you can use Python libraries for the computations.

[3] Please use your own code my_cross_val for this problem.

[4] Since G is a random matrix, every time you generate G and repeat the procedure, your results will be a bit different.
