A database of sodium levels (L) and blood pressure (P) has been compiled, with patients labeled as positive (D=1) or negative (D=0) for the disease based on the expensive test.
Problem 1. Try two different approaches for the task:
1. A k-nearest neighbors classifier.
2. A neighborhood-based classifier.
For the k-NN classifier, you should systematically try several values of k from kmin = 1 to kmax, where you can decide kmax, but it must be no less than 11. For example, you can try k ∈ {1, 3, 5, 7, 9, 11}. You should compare the results for all the cases you tried using the hit-rate performance metric, and decide which value of k you would recommend.
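A minimal sketch of the k-NN sweep, using only the Python standard library. The assignment does not say how the hit-rate should be estimated, so leave-one-out evaluation is assumed here; the toy points, the function names `knn_predict` and `loo_hit_rate`, and the use of Euclidean distance are all illustrative assumptions, not part of the assignment.

```python
import math
from collections import Counter

def knn_predict(train_pts, train_labels, query, k):
    """Majority vote among the k nearest labeled points (Euclidean distance)."""
    order = sorted(range(len(train_pts)),
                   key=lambda i: math.dist(train_pts[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def loo_hit_rate(points, labels, k):
    """Assumed protocol: classify each point using all the other points."""
    hits = 0
    for i in range(len(points)):
        rest_pts = points[:i] + points[i + 1:]
        rest_lbl = labels[:i] + labels[i + 1:]
        hits += knn_predict(rest_pts, rest_lbl, points[i], k) == labels[i]
    return hits / len(points)

# Made-up (L, P) pairs on comparable scales, three per class.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
labels = [0, 0, 0, 1, 1, 1]

# Odd k values avoid tied votes when there are two classes.
rates = {k: loo_hit_rate(points, labels, k) for k in (1, 3, 5)}
for k, rate in rates.items():
    print(f"k={k}: hit-rate={rate:.2f}")
```

On this toy set the hit-rate collapses at k = 5 because each held-out point is outvoted by the three points of the other class, which illustrates why k should be compared over a range rather than fixed in advance.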
For the neighborhood-based classifier, instead of choosing a fixed k, you will choose a circle of radius R around the data point, and use all the labeled points within this circle to decide the class. As with k, you should systematically try out at least 7 different values of R between a value Rmin and a value Rmax. You can choose both these parameters, but please choose reasonable values over a fairly wide range. Based on your simulations, recommend which R value is best using the hit-rate performance metric.
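The radius-based variant can be sketched the same way. Two conventions below are assumptions, since the assignment does not specify them: an empty circle yields no decision (counted as a miss), and tied votes go to class 1. Leave-one-out evaluation and the toy data are also illustrative choices.

```python
import math

def radius_predict(train_pts, train_labels, query, R):
    """Majority vote among labeled points within distance R of the query.

    Returns None when no labeled point falls inside the circle
    (an assumed convention; such cases count as misses below).
    """
    inside = [lbl for p, lbl in zip(train_pts, train_labels)
              if math.dist(p, query) <= R]
    if not inside:
        return None
    ones = sum(inside)
    # Assumed tie-break: an even split is assigned to class 1.
    return 1 if 2 * ones >= len(inside) else 0

def loo_hit_rate(points, labels, R):
    """Assumed protocol: classify each point using all the other points."""
    hits = 0
    for i in range(len(points)):
        rest_pts = points[:i] + points[i + 1:]
        rest_lbl = labels[:i] + labels[i + 1:]
        hits += radius_predict(rest_pts, rest_lbl, points[i], R) == labels[i]
    return hits / len(points)

# Made-up (L, P) pairs on comparable scales, three per class.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
labels = [0, 0, 0, 1, 1, 1]

radii = [0.1, 0.3, 0.5, 1.0, 1.3, 1.6, 2.0]   # seven values from Rmin to Rmax
rates = {R: loo_hit_rate(points, labels, R) for R in radii}
for R, rate in rates.items():
    print(f"R={R}: hit-rate={rate:.2f}")
```

A very small R leaves most circles empty and a very large R lets the other class outvote the local one, so the sweep should cover both extremes.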
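You should think about the numerical ranges of the L and P data to see whether any prior scaling is needed, since both methods rely on distances and a feature with a much larger range would dominate them. If you do rescale the data, please use the same rescaled data for both methods.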
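One common way to rescale is z-score standardization of each feature, sketched below with the standard library. This is only one option (min-max scaling is another); the raw values are hypothetical, and the sketch assumes neither column is constant, which would give a zero standard deviation.

```python
import statistics

def standardize(column):
    """Rescale a feature to zero mean and unit (population) standard deviation."""
    mu = statistics.fmean(column)
    sigma = statistics.pstdev(column)   # assumes the column is not constant
    return [(x - mu) / sigma for x in column]

# Hypothetical raw values, chosen so that the two features have different ranges.
L_raw = [136.0, 140.0, 144.0, 148.0]
P_raw = [100.0, 120.0, 140.0, 160.0]

L_scaled = standardize(L_raw)
P_scaled = standardize(P_raw)

# The same rescaled points would then be passed to both classifiers.
points = list(zip(L_scaled, P_scaled))
print(points)
```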
Then plot two graphs:
1. Performance (hit-rate) value of the k-NN classifier for different values of k.
2. Performance (hit-rate) value of the neighborhood-based classifier for different values of R.
Problem 2.
Implement the perceptron algorithm, and train a perceptron to do the classification on the given dataset. You will need to specify a learning rate, choose a policy for initializing the weights, and decide how long you will train (i.e., how many epochs). You may have to run some trials to decide on the best value for these things. You will need to split your data into a training set and a test set, perhaps using an 80/20 split.
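A minimal sketch of the split and the training loop, assuming a threshold unit with outputs in {0, 1} and the classic perceptron update rule. The synthetic data, zero initialization, learning rate of 0.1, and 200 epochs are placeholder choices for illustration, not recommended settings; small random initial weights would be another reasonable policy.

```python
import random

def predict(w, b, x):
    """Threshold unit: output 1 if the weighted sum is positive, else 0."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

def train_epoch(w, b, data, lr):
    """One pass through the training set with the perceptron update rule."""
    for x, d in data:
        err = d - predict(w, b, x)          # -1, 0, or +1
        w = [w[0] + lr * err * x[0], w[1] + lr * err * x[1]]
        b = b + lr * err
    return w, b

def make_toy_data(n, rng):
    """Synthetic, linearly separable samples (illustration only)."""
    data = []
    while len(data) < n:
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        if abs(x[0] + x[1]) >= 0.2:         # leave a gap between the classes
            data.append((x, 1 if x[0] + x[1] > 0 else 0))
    return data

rng = random.Random(0)
data = make_toy_data(100, rng)
rng.shuffle(data)                           # shuffle before splitting
split = (len(data) * 4) // 5                # 80/20 split
train, test = data[:split], data[split:]

w, b = [0.0, 0.0], 0.0                      # zero initialization (one possible policy)
for epoch in range(200):
    w, b = train_epoch(w, b, train, lr=0.1)

train_hit = sum(predict(w, b, x) == d for x, d in train) / len(train)
test_hit = sum(predict(w, b, x) == d for x, d in test) / len(test)
print(f"train hit-rate={train_hit:.2f}, test hit-rate={test_hit:.2f}")
```

The toy data are separable by construction, so the perceptron converges here; real (L, P) data may not be separable, in which case the weights keep changing and the hit-rate can fluctuate from epoch to epoch.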
Using the best parameter setting, train the perceptron for N epochs, where each epoch is one pass through the whole training set. You can choose N, but it must be large enough to allow for reasonably complete training. Before training (i.e., epoch 0) and then after every 10 epochs (i.e., at epochs 10, 20, ...), calculate two error values:
1. Etrain: (1 − hit-rate) for the training set.
2. Etest: (1 − hit-rate) for the test set.
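The bookkeeping for these two error values can be sketched as below: the errors are recorded once before training and then after every tenth epoch. The function names, the synthetic data, and the settings (zero initial weights, learning rate 0.1, N = 200) are illustrative assumptions only.

```python
import random

def predict(w, x):
    """w = [w_L, w_P, bias]; output 1 if the weighted sum is positive, else 0."""
    return 1 if w[0] * x[0] + w[1] * x[1] + w[2] > 0 else 0

def error(w, data):
    """E = 1 - hit-rate over a list of ((L, P), D) samples."""
    hits = sum(predict(w, x) == d for x, d in data)
    return 1.0 - hits / len(data)

def train_with_history(train, test, n_epochs, lr, every=10):
    """Train a perceptron and log (epoch, Etrain, Etest) at epoch 0 and every `every` epochs."""
    w = [0.0, 0.0, 0.0]
    history = [(0, error(w, train), error(w, test))]    # epoch 0: before training
    for epoch in range(1, n_epochs + 1):
        for x, d in train:
            err = d - predict(w, x)
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
            w[2] += lr * err
        if epoch % every == 0:
            history.append((epoch, error(w, train), error(w, test)))
    return w, history

# Synthetic, linearly separable samples (illustration only).
rng = random.Random(1)
data = []
while len(data) < 100:
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    if abs(x[0] + x[1]) >= 0.2:
        data.append((x, 1 if x[0] + x[1] > 0 else 0))
train, test = data[:80], data[80:]

w, history = train_with_history(train, test, n_epochs=200, lr=0.1)
for epoch, e_train, e_test in history:
    print(f"epoch {epoch:3d}: Etrain={e_train:.3f}  Etest={e_test:.3f}")
```

The `history` list holds exactly the values needed for the required graph: epoch on the x-axis, with Etrain and Etest as two curves on the same axes.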
Report your results by first describing briefly how you decided on your weight initialization, learning rate, etc., and then plotting a graph with epoch number on the x-axis and Etrain and Etest on the y-axis. Both curves must be plotted on the same graph. Then discuss what the results indicate to you about the success or failure of the perceptron, and whether they suggest something about how many epochs you should have trained.