MAT 167 FINAL PROGRAMMING PROJECT FQ 2019
Read Chapter 6 and Chapter 10 up to, but not including, Section 10.3 in Eldén. Read Professor Saito's twenty-first lecture in NS LECTURE 21.pdf.
Write MATLAB code to perform the following handwritten digit recognition computations.
Step 01 Download the handwritten digit database USPS.mat from CANVAS and load this file into your MATLAB session.
(a) This file contains four arrays: train_patterns and test_patterns, each of size 256 × 4649, and train_labels and test_labels, each of size 10 × 4649.
Rename the array train_patterns to training_digits:
training_digits = train_patterns;
Rename the array test_patterns to test_digits:
test_digits = test_patterns;
Rename the array train_labels to training_labels:
training_labels = train_labels;
You may find it helpful to think of these arrays as matrices. The arrays training_digits and test_digits contain a raster scan of the 16 × 16 gray level pixel intensities, which have been normalized to lie within the range [−1, 1]. The arrays training_labels and test_labels contain the true information about the digit images. That is, if the jth handwritten digit image in training_digits truly represents the digit i, then the (i + 1, j)th entry of training_labels is +1, and all the other entries of the jth column of training_labels are −1.
(b) Now, display the first 16 images in training_digits using the subplot(4,4,k) and imagesc functions in MATLAB. Print out the figure and include it in your Programming Project LaTeX and PDF files.
Hint: You need to reshape each column into a matrix of size 16 × 16 and then transpose it in order to display it correctly.
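The reshape-then-transpose hint in Step 01(b) might look like the following minimal sketch. The synthetic training_digits and training_labels are stand-ins (an assumption, so that the snippet runs without USPS.mat), and the title showing each true digit is an optional extra that the assignment does not require.

```matlab
% Stand-in data so the sketch runs without USPS.mat (an assumption; with the
% real file, training_digits and training_labels come from Step 01(a)).
num_images = 16;
training_digits = zeros(256, num_images);
training_labels = -ones(10, num_images);
for k = 1:num_images
    training_digits(:, k) = cos((1:256)' * k * pi / 256);   % values in [-1, 1]
    training_labels(mod(k - 1, 10) + 1, k) = 1;             % pretend digit
end

% The row holding +1 in each column, minus one, is the true digit.
[~, row_index] = max(training_labels, [], 1);
true_digits = row_index - 1;

figure;
colormap(gray);
for k = 1:num_images
    subplot(4, 4, k);
    % reshape fills column by column, so transpose to undo the row-wise raster scan
    digit_image = reshape(training_digits(:, k), 16, 16)';
    imagesc(digit_image);
    axis image; axis off;
    title(num2str(true_digits(k)));
end
```

With the real data, only the stand-in block at the top would be replaced by loading USPS.mat and the renaming in Step 01(a).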
2019 Prof. E. G. Puckett Revision 4.00 1 Sat 23rd Nov, 2019 at 18:40

Step 02 Read the description of this step in Chapter 10.01 of the textbook and/or Professor Saito's Lecture 21. Compute the mean digits in training_digits, put them in a matrix called training_averages of size 256 × 10, and display these 10 mean digit images using subplot(2,5,k) and imagesc. Print out the figure as a PDF file and include it in your LaTeX and PDF documents.
Hint: You can gather (or pool) all the images in training_digits corresponding to digit k − 1 (1 ≤ k ≤ 10) using the following MATLAB command:
training_digits(:, training_labels(k,:)==1);
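As a rough illustration of the pooling hint in Step 02, the sketch below averages the pooled columns class by class. The tiny 4-pixel, 3-class data set and the helper variable class_of_sample are hypothetical, chosen only so the result can be checked by hand; the project itself uses 256 pixels and 10 classes.

```matlab
% Tiny stand-in: 4-pixel "images", 6 samples, 3 classes (hypothetical sizes;
% the project uses 256 pixels, 4649 samples, and 10 classes).
training_digits = [1 3 0 0 5 7;
                   2 4 0 0 6 8;
                   0 0 1 1 0 0;
                   1 1 1 3 2 2];
class_of_sample = [1 1 2 2 3 3];
num_classes = 3;
training_labels = -ones(num_classes, 6);
for j = 1:6
    training_labels(class_of_sample(j), j) = 1;
end

training_averages = zeros(size(training_digits, 1), num_classes);
for k = 1:num_classes
    % Logical indexing pools every column whose kth label is +1
    pooled = training_digits(:, training_labels(k, :) == 1);
    training_averages(:, k) = mean(pooled, 2);   % average across the pooled images
end
disp(training_averages)
```

Passing 2 as the second argument of mean averages along the columns of the pooled set, so each class contributes one column to training_averages.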
Step 03 Read the description of this step in Chapter 10.01 of the textbook and/or Professor Saito's Lecture 21. Now conduct the simplest classification computations as follows.
(a) First, prepare a matrix called test_classification of size 4649 × 10 (see footnote 1) and fill this array by computing the Euclidean distance (or its square) between each image in test_digits and each mean digit image in training_averages.
Hint: The following single line of MATLAB code computes the squared Euclidean distances between all of the test digit images and the kth mean digit of the training dataset:
sum((test_digits-repmat(training_averages(:,k),[1 4649])).^2);
(b) Compute the classification results by finding the position index of the minimum of each column of test_classification. Put the results in a vector test_classification_res of size 1 × 4649.
Hint: You can find the position index giving the minimum of the jth column of test_classification by
>> [tmp, ind] = min(test_classification(:,j));
Then, the variable ind contains the position index, an integer between 1 and 10, of the smallest entry of test_classification(:,j).
(c) Finally, compute the confusion matrix test_confusion of size 10 × 10, print out this matrix, and submit your results in the PDF file containing your report.
Hint: First gather the classification results corresponding to the (k − 1)st digit by
>> tmp=test_classification_res(test_labels(k,:)==1);
This tmp array contains the results of your classification of the test digits whose true digit is k − 1 for 1 ≤ k ≤ 10. In other words, if your classification results were perfect, all the entries of tmp would be k. But in reality, this simplest classification algorithm makes mistakes, so tmp contains values other than k. You need to count how many entries have the value j in tmp, for j = 1 : 10. This will give you the kth row of the test_confusion matrix.
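Parts (a) through (c) of Step 03 can be sketched end to end on a toy problem. Everything in the data block (2-pixel images, 3 classes, the true_class vector) is hypothetical. The sketch also assumes the distances are stored with one row per class and one column per test image, so that the column-wise min in the Step 03(b) hint applies directly; transpose if you adopt the other orientation.

```matlab
% Hypothetical toy data: 2-pixel images, 3 classes, 4 test images.
training_averages = [0 4 0;
                     0 0 4];
test_digits = [0.5 3.5 0.2 3.0;
               0.1 0.4 3.9 2.0];
true_class = [1 2 3 3];                    % last image is deliberately hard
num_classes = size(training_averages, 2);
num_tests = size(test_digits, 2);
test_labels = -ones(num_classes, num_tests);
for j = 1:num_tests
    test_labels(true_class(j), j) = 1;
end

% (a) Row k: squared distance from every test image to mean k.
% Assumption: one row per class, one column per test image.
test_classification = zeros(num_classes, num_tests);
for k = 1:num_classes
    differences = test_digits - repmat(training_averages(:, k), [1 num_tests]);
    test_classification(k, :) = sum(differences.^2, 1);
end

% (b) Index of the nearest mean for each column.
[~, test_classification_res] = min(test_classification, [], 1);

% (c) Row k of the confusion matrix counts the predictions for true class k.
test_confusion = zeros(num_classes);
for k = 1:num_classes
    tmp = test_classification_res(test_labels(k, :) == 1);
    for j = 1:num_classes
        test_confusion(k, j) = sum(tmp == j);
    end
end
disp(test_confusion)
```

The off-diagonal entry in the last row comes from the fourth test image, which lies closer to the second mean than to its own class mean.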
Footnote 1: Watch out! At one point this was 10 × 4649, which is inconsistent with the dimensions of test_classification in Step 03(b).
Step 04 Read the description of this step in Chapter 10.02 of the textbook and/or Professor Saito's Lecture 21. Now conduct an SVD-based classification computation.
(a) Pool all of the images corresponding to the kth digit in the array training_digits, compute the rank 17 SVD of that set of images (i.e., the first 17 singular values and vectors), and put the left singular vectors (the matrix U) of the kth digit into the array left_singular_vectors of size 256 × 17 × 10. For k = 1 : 10, you can do this with the following code:
[left_singular_vectors(:,:,k),~,~] = svds(training_digits(:,training_labels(k,:)==1),17);
You do not need the singular values and right singular vectors in this computation.
(b) Compute the expansion coefficients of each test digit image with respect to the 17 singular vectors of each training digit image set. In other words, you need to compute 17 × 10 numbers for each test digit image. Put the results in the 3D array test_svd17 of size 17 × 4649 × 10. This can be done with the commands
for k=1:10
    test_svd17(:,:,k) = left_singular_vectors(:,:,k)' * test_digits;
end
(c) Next, compute the error between each original test digit image and its rank 17 approximation using the kth digit images in the training data set. The idea of this classification is that a test digit image should belong to the class of the kth digit if the corresponding rank 17 approximation is the best approximation (i.e., has the smallest error) among the 10 such approximations. Prepare a matrix test_digits_rank_17_approximation of size 10 × 4649, and put those approximation errors into this matrix.
Hint: The rank 17 approximation of test_digits using the 17 left singular vectors of the kth digit training images can be computed by
left_singular_vectors(:,:,k)*test_svd17(:,:,k);
If this command gives an error, such as
MATLAB to become unresponsive. See array size limit or preference panel
for more information.
try replacing the command after the Hint above with the following code.
for k = 1:10
    for j = 1:4649
        % Error between test image j and its rank 17 approximation from the kth basis
        tmp = norm(test_digits(:,j) - left_singular_vectors(:,:,k)*test_svd17(:,j,k));
        test_digits_rank_17_approximation(k,j) = tmp;
        % This minimum is final only once k reaches 10 and the column is complete
        [tmp,ind] = min(test_digits_rank_17_approximation(:,j));
        svd_classification(1,j) = ind;
    end
end
(d) Finally, compute the confusion matrix using this SVD-based classification method by following the same strategy as in Step 03(b) and Step 03(c) above. Name this confusion matrix test_svd17_confusion. Include this matrix in your report and submit your results.
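The residual computation in Step 04(c) can also be written one class at a time without the inner loop over test images. The sketch below is one possible arrangement, not the required one. It uses a hypothetical toy problem (3-pixel images, 2 classes, rank 1 bases) and an economy-size svd in place of svds, purely so that the numbers are easy to verify.

```matlab
% Hypothetical toy problem: 3-pixel images, 2 classes, rank-1 bases.
class_training = {[2 4 6; 0 0 0; 0 0 0], [0 0 0; 1 3 5; 0 0 0]};
test_digits = [3   0.1 1;
               0.2 2   1.5;
               0   0   0];
basis_rank = 1;
num_classes = numel(class_training);
num_tests = size(test_digits, 2);

residuals = zeros(num_classes, num_tests);
for k = 1:num_classes
    % Economy-size SVD stands in for svds on this tiny example
    [U, ~, ~] = svd(class_training{k}, 'econ');
    U = U(:, 1:basis_rank);
    coefficients = U' * test_digits;      % expansion coefficients, as in Step 04(b)
    approximation = U * coefficients;     % projection onto the class basis
    % Column-wise 2-norm of the residual, all test images at once
    residuals(k, :) = sqrt(sum((test_digits - approximation).^2, 1));
end

% Each test image goes to the class whose basis leaves the smallest residual.
[~, svd_classification] = min(residuals, [], 1);
disp(svd_classification)
```

Only one matrix the size of test_digits is held at a time, which sidesteps the large 3D array of approximations while producing the same residual norms as the double loop.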
Step 05 ANALYZE YOUR RESULTS IN A WELL WRITTEN REPORT!
(a) For Step 01 explain your understanding of the data structure in which the images of the digits are stored. In particular, include a brief explanation of the difference between the training data and the test data. (This is a simple example of machine learning. These are most likely the first machine learning algorithms to be widely used in the real world.)
(b) Give an explanation of what you are doing in Step 02, and why you are doing it. You will find some helpful comments concerning Step 02 in Chapter 10.01 of Eldén. Include some thoughts to support your comments.
(c) Comment on the intermediate results at the end of Step 03 and at the end of Step 04. How effective is each algorithm; i.e., for that particular algorithm, what percentage of each digit is identified correctly? Which digit is the most difficult to identify correctly? Which digit is the easiest to identify correctly? You can obtain all of this information from the confusion matrices you produced in Step 03 and Step 04. Include some thoughts to support your comments. In particular, in YOUR OWN WORDS explain the theory that is behind the algorithm in (a) through (d). (This is discussed in detail in Chapter 10.2 of Eldén.)
(d) Summarize all of your results in a separate section at the end. Compare your results from Step 03 and Step 04. Which of the two algorithms yields the better result? Why?
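The percentages asked for in Step 05(c) follow directly from a confusion matrix whose rows are true classes and whose columns are predicted classes. The sketch below uses a made-up 3 × 3 matrix (not real results) to show one way to read them off; the variable names other than test_confusion are illustrative.

```matlab
% Hypothetical 3-by-3 confusion matrix (rows: true class, columns: predicted).
test_confusion = [8 1 1;
                  2 6 2;
                  0 1 9];

correct_counts = diag(test_confusion);     % correctly classified, per class
class_totals = sum(test_confusion, 2);     % number of test images, per class
percent_correct = 100 * correct_counts ./ class_totals;
overall_percent = 100 * sum(correct_counts) / sum(class_totals);

[~, hardest_class] = min(percent_correct);
[~, easiest_class] = max(percent_correct);
fprintf('Per-class accuracy (%%): %s\n', mat2str(percent_correct', 4));
fprintf('Overall accuracy: %.1f%%\n', overall_percent);
```

For the 10 × 10 matrices in this project, class index k corresponds to digit k − 1, so subtract one from hardest_class and easiest_class before reporting them as digits.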
Step 06 Submit a well documented MATLAB program named Digit_Recognition_youremailname.m
This program should perform all of the tasks in Step 01 to Step 04 above without any user input. It is sufficient to have your program print the various images and tables on the computer screen. In particular, your program does not have to produce a PDF file containing the images of the digits produced in Step 01(b) and Step 02.
Again, here is a description of what is meant by a well documented MATLAB program.
Do not submit only the MATLAB source code without comments. Furthermore, do not include the bare minimum of explanation for each subsection of your code. Please consider using an active mind when including comments in your program. In particular, as programmers and highly educated individuals, it is worth your time to describe what you are doing in your own words for each individual segment of the code; i.e., each portion of the code that performs a separate task, even if it is only inputting a file. For example: What is the format of the file: binary, text, MATLAB data structures? What is contained in the file? How is it stored? Relate the algorithm(s) back to the theory we have been studying in lecture and in the homework assignments. When you read your own code, you should be able to easily identify what you have learned from writing this program, and how this relates to the themes presented in lectures and in the textbook.
Step 07 You will be graded on the following items.
(a) Your MATLAB program should meet the following specifications.
It should not require any user input.
Your code should input the file USPS.mat.
The TA should be able to run your code on his machine using just your *.m file and his copy of USPS.mat.
Your code should run without breaking.
The TA should not have to dig in variable explorers to see what you're talking about.
Display all output on the screen clearly and use variable names that make sense; i.e., names that explicitly tell the TA, or any other person who sees your output, what they are looking at.
(b) You'll be graded on the correctness of the following steps.
1(b). Your MATLAB program should display the digits in Step 01(b), and this image of the digits should be included in your report. Typically you create a PDF file and input it using the LaTeX command \includegraphics.
2. Print out the figure as a PDF file and include it in your report.
3(c). Output your 10 × 10 test_confusion matrix when your *.m file is run and also include it in your report.
4(d). Similarly, output the confusion matrix from the SVD algorithm when your *.m file is run and also include it in your report.
5. Clearly label 5(a), 5(b), 5(c), 5(d) and explain in detail.
6. Clearly comment each separate part of your code. If the TA has to guess what one or more lines of your code are doing, he is going to be concerned. He doesn't need a long explanation of what's going on; a brief explanation of what the line or lines of code accomplish will suffice.