ST340 Programming for Data Science
Assignment 2
Released: Monday week 5, 2019-10-28; Deadline: 12:00 on Monday week 8, 2019-11-18.
Instructions
Work individually.
Specify your student numbers and names on your assignment.
Any programming should be in R. Your report should be created using R Markdown. Submit a single
knitted PDF document which includes any code you have written.
Q1 Expectation Maximization
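For the EM algorithm with the mixture of Bernoullis model, we need to maximize the function
$$f(\mu_{1:K}) = f(\mu_1, \ldots, \mu_K) = \sum_{i=1}^{n} \sum_{k=1}^{K} \gamma_{ik} \log p(x_i \mid \mu_k),$$
where for $i \in \{1, \ldots, n\}$ each $x_i \in \{0,1\}^p$, for $k \in \{1, \ldots, K\}$ each $\mu_k = (\mu_{k1}, \mu_{k2}, \ldots, \mu_{kp}) \in [0,1]^p$, and
$$p(x_i \mid \mu_k) = \prod_{j=1}^{p} \mu_{kj}^{x_{ij}} (1 - \mu_{kj})^{1 - x_{ij}}.$$
(a) Show that the unique stationary point is obtained by choosing for each $k \in \{1, \ldots, K\}$
$$\mu_k = \frac{\sum_{i=1}^{n} \gamma_{ik} x_i}{\sum_{i=1}^{n} \gamma_{ik}}.$$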
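As a possible starting point for part (a), and only as a sketch, the log of the Bernoulli product can be expanded and differentiated with respect to a single coordinate. The notation is an assumption here: $\gamma_{ik}$ denotes the weight attached to observation $i$ and component $k$, and $\mu_{kj}$ the $j$th coordinate of the $k$th parameter vector.

```latex
\log p(x_i \mid \mu_k)
  = \sum_{j=1}^{p} \bigl[ x_{ij} \log \mu_{kj} + (1 - x_{ij}) \log (1 - \mu_{kj}) \bigr],
\qquad
\frac{\partial f}{\partial \mu_{kj}}
  = \sum_{i=1}^{n} \gamma_{ik}
    \left( \frac{x_{ij}}{\mu_{kj}} - \frac{1 - x_{ij}}{1 - \mu_{kj}} \right).
```

Setting this derivative to zero for every $j$ and solving for $\mu_{kj}$ is one route to the stationary point; uniqueness still has to be argued separately.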
(b) The newsgroups dataset contains binary occurrence data for 100 words across 16,242 postings. Postings are tagged by their highest level domain; that is, into four broad topics (comp.*, rec.*, sci.*, talk.*). The dataset includes documents, a 16,242 × 100 matrix whose (i, j)th entry is an indicator for the presence of the jth word in the ith post; newsgroups, a vector of length 16,242 whose ith entry denotes the true label for the ith post (i.e., to which of the four topics the ith post belongs); groupnames, naming the four topics; and wordlist, listing the 100 words.
(i) Run the EM algorithm for the mixture of Bernoullis model on the newsgroups data with K = 4. You should use some of the code from the EM Lab to help you. A run on the newsgroups dataset could take over 10 minutes, so it is recommended to test your code on a small synthetic dataset first.
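One way to build the small synthetic dataset suggested above is to simulate directly from a mixture of Bernoullis with known parameters. The sketch below is illustrative only: the function name `simulate_bernoulli_mixture`, its arguments, and the uniform mixing weights are assumptions, not part of the EM Lab code.

```r
# Sketch: simulate a small mixture-of-Bernoullis dataset for testing EM code.
# All names here are illustrative assumptions, not taken from the EM Lab.
simulate_bernoulli_mixture <- function(n, p, K, seed = 1) {
  set.seed(seed)
  # Uniform mixing weights and random per-cluster success probabilities.
  weights <- rep(1 / K, K)
  mus <- matrix(runif(K * p, min = 0.1, max = 0.9), nrow = K, ncol = p)
  # Latent cluster label for each observation.
  z <- sample.int(K, size = n, replace = TRUE, prob = weights)
  # Entry (i, j) is Bernoulli with success probability mus[z[i], j].
  probs <- mus[z, , drop = FALSE]
  xs <- matrix(rbinom(n * p, size = 1, prob = probs), nrow = n, ncol = p)
  list(xs = xs, z = z, mus = mus, weights = weights)
}

sim <- simulate_bernoulli_mixture(n = 500, p = 20, K = 3)
dim(sim$xs)    # number of observations and number of binary features
table(sim$z)   # how many observations were drawn from each cluster
```

Because the true labels `sim$z` and parameters `sim$mus` are known, the output of an EM run on `sim$xs` can be compared against them before moving to the full newsgroups data.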
(ii) Comment on the clustering provided by your run of the algorithm. Can you measure its accuracy?
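One possible accuracy measure, sketched here under stated assumptions, accounts for the fact that cluster labels are arbitrary: it finds the relabelling of the clusters that agrees best with the true labels. The names `all_permutations` and `clustering_accuracy` are hypothetical helpers, and hard cluster assignments are assumed to be available as an integer vector.

```r
# Sketch: accuracy of a clustering up to relabelling of the clusters.
# `clusters` and `truth` are integer vectors with values in 1..K.
all_permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (rest in all_permutations(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], rest)
    }
  }
  out
}

clustering_accuracy <- function(clusters, truth, K) {
  # Confusion table: rows are estimated clusters, columns are true labels.
  confusion <- table(factor(clusters, levels = 1:K),
                     factor(truth, levels = 1:K))
  best <- 0
  for (perm in all_permutations(1:K)) {
    # perm[c] is the true label assigned to estimated cluster c.
    matched <- sum(confusion[cbind(1:K, perm)])
    best <- max(best, matched)
  }
  best / length(truth)
}

truth    <- c(1, 1, 2, 2, 3, 3, 4, 4)
clusters <- c(2, 2, 1, 1, 4, 4, 3, 1)
clustering_accuracy(clusters, truth, K = 4)  # 7 of 8 points agree: 0.875
```

With K = 4 there are only 24 permutations, so the exhaustive search is cheap. The confusion table itself is also worth reporting, since it shows which topics are confused with each other.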
Q2 Two-armed Bernoulli bandits
(a) Implement both Thompson sampling and the ε-decreasing strategy in this setting, with the unknown success probabilities of the arms being 0.6 and 0.4.
(b) Describe the behaviour of ε-decreasing when the sequence $(\epsilon_n)_{n \ge 1}$ is defined by $\epsilon_n = \min\{1, C n^{-1}\}$, where C is some positive constant, and check whether it is consistent with your implementation.
(c) Describe the behaviour of ε-decreasing when the sequence $(\epsilon_n)_{n \ge 1}$ is defined by $\epsilon_n = \min\{1, C n^{-2}\}$, where C is some positive constant, and check whether it is consistent with your implementation.
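The decreasing-exploration strategy in parts (b) and (c) might be sketched as below. This is a minimal illustration rather than a reference implementation: the function name `epsilon_decreasing`, the `exponent` argument (1 for a C/n schedule, 2 for C/n^2), and the rule of exploring while any arm is still unplayed are all assumptions of this sketch.

```r
# Sketch of an epsilon-decreasing strategy for a Bernoulli bandit.
# epsilon_decreasing, ps, n_steps, C and exponent are illustrative names.
epsilon_decreasing <- function(ps, n_steps, C, exponent = 1) {
  n_arms <- length(ps)
  successes <- rep(0, n_arms)
  pulls <- rep(0, n_arms)
  arms <- integer(n_steps)
  rewards <- integer(n_steps)
  for (n in seq_len(n_steps)) {
    eps <- min(1, C / n^exponent)            # exploration probability at step n
    if (runif(1) < eps || any(pulls == 0)) {
      arm <- sample.int(n_arms, 1)           # explore: uniformly random arm
    } else {
      arm <- which.max(successes / pulls)    # exploit: best empirical mean
    }
    reward <- rbinom(1, 1, ps[arm])
    successes[arm] <- successes[arm] + reward
    pulls[arm] <- pulls[arm] + 1
    arms[n] <- arm
    rewards[n] <- reward
  }
  list(arms = arms, rewards = rewards, pulls = pulls)
}

set.seed(1)
run_inv    <- epsilon_decreasing(c(0.6, 0.4), n_steps = 2000, C = 5, exponent = 1)
run_inv_sq <- epsilon_decreasing(c(0.6, 0.4), n_steps = 2000, C = 5, exponent = 2)
run_inv$pulls      # pulls per arm under the C/n schedule
run_inv_sq$pulls   # pulls per arm under the C/n^2 schedule
```

Repeating both runs over many seeds and comparing how often each arm is pulled is one way to check an analysis of the two schedules against the implementation.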
(d) Compare and contrast the implementations of ε-decreasing and Thompson sampling for this problem.
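For comparison, Thompson sampling for Bernoulli arms might be sketched as follows. The independent Beta(1, 1) priors and the names `thompson_sampling`, `ps` and `n_steps` are assumptions of this sketch, not requirements of the assignment.

```r
# Sketch of Thompson sampling for a Bernoulli bandit with Beta(1, 1) priors.
thompson_sampling <- function(ps, n_steps) {
  n_arms <- length(ps)
  succ <- rep(1, n_arms)   # Beta first parameter: 1 + observed successes
  fail <- rep(1, n_arms)   # Beta second parameter: 1 + observed failures
  arms <- integer(n_steps)
  rewards <- integer(n_steps)
  for (n in seq_len(n_steps)) {
    # Draw one sample from each arm's posterior and play the largest draw.
    draws <- rbeta(n_arms, succ, fail)
    arm <- which.max(draws)
    reward <- rbinom(1, 1, ps[arm])
    succ[arm] <- succ[arm] + reward
    fail[arm] <- fail[arm] + 1 - reward
    arms[n] <- arm
    rewards[n] <- reward
  }
  list(arms = arms, rewards = rewards, succ = succ, fail = fail)
}

set.seed(1)
run_ts <- thompson_sampling(c(0.6, 0.4), n_steps = 1000)
table(run_ts$arms)   # how often each arm was played
```

Unlike the decreasing-exploration strategy, this sketch has no tuning constant or schedule: exploration is driven entirely by the spread of the posterior distributions.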
Q3 k nearest neighbours
(a) Create a function to do kNN regression using a user-supplied distance function, i.e.

    knn.regression.test <- function(k, train.X, train.Y, test.X, test.Y, distances) {
      # YOUR CODE HERE
      print(sum((test.Y - estimates)^2))
    }

Predicted labels should use the inverse-distance weighting to each neighbour.
(b) Test your function on the following two toy datasets using distances.l1 from lab 6. Try different values of k and report your results.
Toy dataset 1:

    n <- 100
    train.X <- matrix(sort(rnorm(n)), n, 1)
    train.Y <- (train.X < -0.5) + train.X * (train.X > 0) + rnorm(n, sd = 0.03)
    plot(train.X, train.Y)
    test.X <- matrix(sort(rnorm(n)), n, 1)
    test.Y <- (test.X < -0.5) + test.X * (test.X > 0) + rnorm(n, sd = 0.03)
    k <- 2
    knn.regression.test(k, train.X, train.Y, test.X, test.Y, distances.l1)

Toy dataset 2:

    train.X <- matrix(rnorm(200), 100, 2)
    train.Y <- train.X[, 1]
    test.X <- matrix(rnorm(100), 50, 2)
    test.Y <- test.X[, 1]
    k <- 3
    knn.regression.test(k, train.X, train.Y, test.X, test.Y, distances.l1)

(c) Load the Iowa dataset (see ?lasso2::Iowa for details). Try to predict the yield in the years 1931, 1933, ... based on the data from 1930, 1932, ...

    install.packages("lasso2")
    library("lasso2")
    data(Iowa)
    train.X <- as.matrix(Iowa[seq(1, 33, 2), 1:9])
    train.Y <- c(Iowa[seq(1, 33, 2), 10])
    test.X <- as.matrix(Iowa[seq(2, 32, 2), 1:9])
    test.Y <- c(Iowa[seq(2, 32, 2), 10])
    k <- 5
    knn.regression.test(k, train.X, train.Y, test.X, test.Y, distances.l2)

(d) Try different values of k, and compare your results with ordinary least squares regression and ridge regression.
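The inverse-distance weighting asked for in Q3 can be illustrated on a tiny example. The sketch below is not the lab 6 code: `l1_distances` is a hypothetical stand-in for a distance function (the real `distances.l1` may have a different interface), and the small constant `eps` guarding against division by zero is a choice made for this sketch.

```r
# Hypothetical L1 distance: entry (i, j) is the distance between test row i
# and training row j. The lab 6 version may differ.
l1_distances <- function(train, test) {
  d <- matrix(0, nrow = nrow(test), ncol = nrow(train))
  for (i in seq_len(nrow(test))) {
    d[i, ] <- rowSums(abs(sweep(train, 2, test[i, ])))
  }
  d
}

# Sketch of kNN regression with inverse-distance weights.
knn_idw_predict <- function(k, train_x, train_y, test_x, distances, eps = 1e-12) {
  d <- distances(train_x, test_x)            # one row per test point
  sapply(seq_len(nrow(d)), function(i) {
    nearest <- order(d[i, ])[seq_len(k)]     # indices of the k closest points
    w <- 1 / (d[i, nearest] + eps)           # closer neighbours weigh more
    sum(w * train_y[nearest]) / sum(w)       # weighted average of their labels
  })
}

train_x <- matrix(c(0, 1, 2, 3), ncol = 1)
train_y <- c(0, 10, 20, 30)
test_x  <- matrix(c(0.5, 2.25), ncol = 1)
knn_idw_predict(2, train_x, train_y, test_x, l1_distances)
```

For the first test point the two nearest neighbours are equally distant, so the prediction is their plain average, about 5. For the second, the neighbour at distance 0.25 gets three times the weight of the one at distance 0.75, giving about 22.5.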