[Solved] EE5934 Homework 1


HW1 introduces you to the dataset that will be used for this and later assignments, together with some useful preprocessing methods. Later assignments will cover different classification methods.

In [1]:

import os
import sys
sys.path.append('../../homework')

import matplotlib.pyplot as plt
import numpy as np

%load_ext autoreload
%autoreload 2

USPS

Read here to learn more about the USPS dataset.

In [2]:

# Load the USPS data
usps_dir = '../data/usps.h5'
import h5py
with h5py.File(usps_dir, 'r') as hf:
    train = hf.get('train')
    trainX = train.get('data')[:]
    trainy = train.get('target')[:]
    test = hf.get('test')
    testX = test.get('data')[:]
    testy = test.get('target')[:]

In [16]:

# Print some statistics about the data
print('train image data shape:', trainX.shape)
print('train label data shape:', trainy.shape)
print('test image data shape:', testX.shape)
print('test label data shape:', testy.shape)

train image data shape: (7291, 256)

train label data shape: (7291,)

test image data shape: (2007, 256)

test label data shape: (2007,)


Flatten the data

The 16 x 16 USPS images have been flattened into 256 x 1 image vectors.

Flattening the data will lead to some loss of information (such as spatial correlation between pixels) but it is nevertheless useful for training some linear classifiers.
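Concretely, flattening is just a reshape of the pixel values: each 16 x 16 image becomes a 256-dimensional vector, and the image can always be recovered for display. A minimal sketch (assuming trainX and trainy have been loaded as in the cell above):

# Flattening keeps every pixel value; only the explicit 2-D neighbourhood structure is discarded
img = trainX[0].reshape(16, 16)      # recover the 16 x 16 image from the 256-d vector
vec = img.reshape(-1)                # flatten it back to a 256-d vector
assert np.allclose(vec, trainX[0])   # the round trip preserves every pixel value

plt.imshow(img, cmap='gray')
plt.title('label: {}'.format(trainy[0]))
plt.axis('off')
plt.show()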

Write some code below to visualize the flattened data. From the visualization, is the flattened data sufficiently discriminative for classification purposes? Please provide a brief explanation.

In [182]:

######################################################################
# TODO:                                                              #
# Visualize the data to illustrate that the flattened data is still  #
# good for classification                                            #
######################################################################
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

# PCA: subtract the per-image mean, form the scatter matrix, keep the top-3 directions
train_mean = np.mean(trainX, axis=-1, keepdims=True)
X = trainX - train_mean
S = X.T.dot(X)
_, D, Vh = np.linalg.svd(S)
trainV = trainX.dot(Vh.T[:, :3])

# Visualize the projection onto the first three principal directions
fig = plt.figure()
ax = Axes3D(fig)
ax.scatter(trainV[:, 0], trainV[:, 1], trainV[:, 2], s=1, c=trainy)
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
ax.view_init(30, 10)
plt.title('Figure 1. Flattened data visualization', fontsize=10)
######################################################################
# END OF YOUR CODE                                                   #
######################################################################

Out[182]:

Text(0.5, 0.92, 'Figure 1. Flattened data visualization')

The flattened data is sufficiently discriminative for classification purposes.

In Figure 1, data points with the same label cluster together, so the classes remain clearly separable even after flattening.
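As an optional sanity check (not required by the assignment), a very simple classifier run directly on the flattened vectors gives a rough quantitative feel for how separable the classes are. The nearest-class-mean rule below is an illustrative sketch, assuming trainX, trainy, testX and testy are loaded as above:

# Nearest-class-mean classifier on the raw flattened vectors
class_means = np.stack([trainX[trainy == d].mean(axis=0) for d in range(10)])   # (10, 256)

# Assign each test image to the digit whose mean image is closest in Euclidean distance
dists = np.linalg.norm(testX[:, None, :] - class_means[None, :, :], axis=-1)    # (2007, 10)
pred = np.argmin(dists, axis=1)
print('nearest-mean test accuracy:', (pred == testy).mean())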

A few more visualization exercises

Please complete the following to get a better understanding of the USPS data:

  1. Plot average images of digits 0-9 in one figure. What does this show or represent?
  2. Subtract the average pixel value computed from all training samples and then re-plot the above average images. How are these images useful?

In [200]:

######################################################################
# TODO:                                                              #
# Plot each average image of digits 0-9                              #
######################################################################
# Plot average images of digits 0-9
import matplotlib.gridspec as gridspec

means = {i: None for i in range(10)}
fig2 = plt.figure(constrained_layout=True)
spec2 = gridspec.GridSpec(2, 5, figure=fig2)
plt.title('Figure 2. Average images of digits 0-9')
plt.axis('off')
for i in range(10):
    means[i] = np.mean(trainX[np.where(trainy == i)[0], :], axis=0)
    fig2.add_subplot(spec2[i // 5, i % 5])
    plt.imshow(means[i].reshape(16, 16))
######################################################################
# END OF YOUR CODE                                                   #
######################################################################

In [202]:

######################################################################
# TODO:                                                              #
# Plot the average images of digits 0-9 in one figure                #
# (this time with average pixel values subtracted)                   #
######################################################################
# Plot average images of digits 0-9 after subtracting the mean of all training samples
trainX_mean = np.mean(trainX, axis=0, keepdims=True)
X = trainX - trainX_mean
means = {i: None for i in range(10)}
fig3 = plt.figure(constrained_layout=True)
spec3 = gridspec.GridSpec(2, 5, figure=fig3)
plt.title('Figure 3. Average images of digits 0-9 after subtracting the overall mean')
plt.axis('off')
for i in range(10):
    means[i] = np.mean(X[np.where(trainy == i)[0], :], axis=0)
    fig3.add_subplot(spec3[i // 5, i % 5])
    plt.imshow(means[i].reshape(16, 16))
######################################################################
# END OF YOUR CODE                                                   #
######################################################################

1. From Figure 2, the average image of each digit shows the general way that digit is written across the training set.

2. Comparing Figure 2 with Figure 3, the uncentred pixels tend towards extreme values, whereas the mean-subtracted images are balanced around zero; without this centring, the gradients during training can easily go out of control.
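One optional way to see this numerically (reusing the trainX_mean computed above) is to compare the overall pixel statistics before and after centring:

# Pixel statistics before and after subtracting the overall mean image
centred = trainX - trainX_mean
print('raw     pixels  mean: %+.3f  min: %+.3f  max: %+.3f' % (trainX.mean(), trainX.min(), trainX.max()))
print('centred pixels  mean: %+.3f  min: %+.3f  max: %+.3f' % (centred.mean(), centred.min(), centred.max()))
# After centring, every pixel has zero mean over the training set,
# so the inputs are balanced around 0 rather than biased towards one extreme.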

In your submission, in addition to the completed code and the resulting images for the visualizations, also provide brief responses to the above questions and a brief analysis of the visualized data.

You may refer to here for more interesting and beautiful visualisations. Of course, the methods employed are more complex.
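For instance, a nonlinear embedding such as t-SNE usually produces a much cleaner 2-D layout of the digit classes than the linear PCA projection above. The sketch below is not part of the assignment code and assumes scikit-learn is available:

from sklearn.manifold import TSNE

# t-SNE is slow on the full training set, so embed a random subset
rng = np.random.default_rng(0)
idx = rng.choice(len(trainX), size=2000, replace=False)
emb = TSNE(n_components=2, init='pca', random_state=0).fit_transform(trainX[idx])

plt.scatter(emb[:, 0], emb[:, 1], s=2, c=trainy[idx], cmap='tab10')
plt.colorbar(label='digit')
plt.title('t-SNE embedding of a USPS subset')
plt.show()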
