
GR5242 HW01 Problem 1: Basics

Instructions: This problem is an individual assignment — you are to complete this problem on your own, without conferring with your classmates. You should submit a completed and published notebook to Courseworks; no other files will be accepted.

Description: The goal of this problem is to get you familiar with neural network training from end to end.

Our main tool is torch, especially torch.nn and torch.optim, which help us with model building and automatic differentiation / backpropagation.

There are 4 questions in this notebook: 3 coding questions and 1 text question. Each coding question expects 1~3 lines of code, and the text question expects just 1 sentence of explanation.

In [ ]:
# PyTorch imports:
#
# torch is the base package, nn gives nice classes for Neural Networks,
# F contains our ReLU function, optim gives our SG method,
# DataLoader allows us to do batches efficiently,
# and torchvision is for downloading MNIST data directly from PyTorch
import torch
from torch import nn
from torch.nn import functional as F
import torch.optim as optim
from torch.utils.data import DataLoader
import torchvision.transforms as transforms
import torchvision.datasets as datasets

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt

print(torch.__version__)

Dataset

We will be working on the MNIST dataset, which contains images of handwritten digits 0-9 and their corresponding labels.

We have it set up to download the data directly from the torch library.

In [ ]:
# First, we will define a way of transforming the dataset automatically
# upon downloading from pytorch:
# first convert an image to a tensor and then scale its values to be between -1 and 1
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),])

# Next, we fetch the data
mnist_train = datasets.MNIST(root='./data', train=True,
                             download=True, transform=transform)
mnist_test = datasets.MNIST(root='./data', train=False,
                            download=True, transform=transform)

# and define our DataLoaders
train_loader = DataLoader(mnist_train, batch_size=32, shuffle=True)
test_loader = DataLoader(mnist_test, batch_size=32, shuffle=True)
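The scaling done by the transform can be checked by hand. ToTensor() produces pixel values in [0, 1], and Normalize(mean, std) computes (x - mean) / std per channel, so a mean and std of 0.5 map [0, 1] onto [-1, 1]. A minimal plain-Python sketch of that arithmetic follows; the helper name normalize is hypothetical and not part of torchvision.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Apply the per-channel formula (x - mean) / std to one pixel value."""
    return (pixel - mean) / std

# Pixel values as produced by ToTensor(), i.e. already scaled to [0, 1].
for pixel in (0.0, 0.25, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

The darkest pixel (0.0) maps to -1 and the brightest (1.0) maps to 1, which is the range the comment in the cell above describes.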

Each image is represented as a 28×28 matrix of pixel values, and each label is the corresponding digit.

Let’s show a random image! Try running the cell below a few times to see different examples and how the DataLoaders shuffle batches.

Note: Why is this random, when there is no random code in the next cell? The randomness comes from shuffle=True in train_loader!

In [ ]:
inputs, classes = next(iter(train_loader))
plt.imshow(inputs[23].squeeze())
plt.title('Training label: ' + str(classes[23].item()))
plt.show()

Let’s now show 25 of them in black and white:

In [ ]:
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(inputs[i].squeeze(), cmap=plt.cm.binary)
    plt.xlabel(classes[i].item())
plt.show()

By printing out the shapes and dataset sizes, we see there are 60,000 training images and 10,000 test images. As noted above, each image is a 28×28 grid of pixel values (stored as a 1×28×28 tensor), and each label is the corresponding digit.

In [ ]:
# For training data
train_data_example, train_label_example = mnist_train[0]
print("Shape of a single training image:", train_data_example.shape)

# For test data
test_data_example, test_label_example = mnist_test[0]
print("Shape of a single test image:", test_data_example.shape)

# The total number of images in each dataset
print("Total number of training images:", len(mnist_train))
print("Total number of test images:", len(mnist_test))

Recap of classification task

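In a classification task with $K$ classes, suppose the predicted logits for an image are $s_1, \dots, s_K$. The predicted probabilities are then

$$p_i = \frac{\exp(s_i)}{\sum_{j=1}^{K} \exp(s_j)}, \qquad i = 1, \dots, K.$$

The CrossEntropy (CE) loss is defined as

$$\mathrm{CE} = -\sum_{i=1}^{K} t_i \log p_i,$$

where $t_i = 1$ if the image belongs to the $i$-th class and $t_i = 0$ otherwise.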
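As a quick numerical check of this recap, the sketch below computes probabilities from logits and then the CE loss for a one-hot target, using the standard definitions p_i = exp(s_i) / Σ_j exp(s_j) and CE = -Σ_i t_i log p_i. It is plain Python with made-up logits; the function names are illustrative and are not the PyTorch API.

```python
import math

def softmax(logits):
    """Map logits s_1..s_K to probabilities p_i = exp(s_i) / sum_j exp(s_j)."""
    shift = max(logits)  # subtracting the max leaves the result unchanged
    exps = [math.exp(s - shift) for s in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_class):
    """CE = -sum_i t_i * log(p_i); with a one-hot target only one term survives."""
    probs = softmax(logits)
    return -math.log(probs[true_class])

logits = [2.0, 1.0, 0.1]   # made-up logits for K = 3 classes
probs = softmax(logits)
print("probabilities:", probs)
print("sum of probabilities:", sum(probs))
print("CE if the true class is 0:", cross_entropy(logits, 0))
print("CE if the true class is 2:", cross_entropy(logits, 2))
```

The loss is small when the true class has the largest logit and large when it has the smallest, which is the behavior training exploits.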

Model

Now, we will build a model to predict the logits of images for the classification task.

Question 1: Building the Model

In the following, we will write a class for a basic one-hidden-layer, ReLU, feedforward network. There are a few components to a model in PyTorch, and we will break them down step by step.

First, we need to define the class. As with any class definition, we start with an __init__ method. Since PyTorch provides us with many useful features within the torch.nn.Module class, we will use inheritance to pass these down to our Net class. This involves putting nn.Module inside the parentheses in the class definition, and a super().__init__() call in the __init__() method.

Within the initialization, we then define two layers: one hidden layer with 128 neurons, and one output layer with 10 class logits. The hidden layer should take an input of size 28 × 28 = 784 and give an output of size 128, while the output layer takes an input of size 128 and gives an output of size 10. It is suggested to use the nn.Linear() object to accomplish this, which applies the transformation $z = xW^T + b$.
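To make the linear transformation concrete, here is a small plain-Python sketch of z = xW^T + b for a single input row. It uses toy sizes (3 inputs, 2 outputs) rather than the sizes asked for in the question, and the helper name linear is hypothetical. The weight matrix has one row per output unit, which is the (out_features, in_features) layout that nn.Linear uses for its weight.

```python
def linear(x, weight, bias):
    """Compute z = x W^T + b for one input row x.

    weight has one row per output unit (out_features x in_features),
    so output j is the dot product of x with row j, plus bias j.
    """
    return [sum(w_ji * x_i for w_ji, x_i in zip(row, x)) + b_j
            for row, b_j in zip(weight, bias)]

x = [1.0, 2.0, 3.0]                  # in_features = 3
weight = [[0.1, 0.2, 0.3],           # out_features = 2 rows
          [-1.0, 0.0, 1.0]]
bias = [0.5, -0.5]

z = linear(x, weight, bias)
print("output size:", len(z))
print("z =", z)
```

The output length equals the number of rows of the weight matrix, which is why the first argument of nn.Linear is the input size and the second is the output size.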

Next, we define a special method called forward(), which defines how data propagates through the model. This method will be called either by model.forward(x) or by model(x), and is where PyTorch looks for the information for its automatic derivative computation capabilities.

In the forward method, we first will reshape our image img using img.view().

Then, we will apply the hidden layer (the one we defined) and the ReLU function F.relu.

Finally, we apply the output layer and return our output. Importantly, do not apply SoftMax to the output just yet. We will handle that part later.
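The reshape step is easy to check in isolation. The sketch below uses a placeholder batch of zeros with the shape a DataLoader batch of 32 MNIST images would have (an assumption based on the batch size set earlier) and flattens each image into a row of 784 values.

```python
import torch

# Placeholder batch: 32 grayscale images of 28x28 pixels (all zeros).
batch = torch.zeros(32, 1, 28, 28)

# -1 lets PyTorch infer the batch dimension from the remaining size.
flat = batch.view(-1, 28 * 28)

print(tuple(batch.shape), "->", tuple(flat.shape))
```

Because the batch dimension is inferred, the same call also works for a final batch that has fewer than 32 images.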

In [ ]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        ### YOUR CODE HERE ###
        # define hidden layer and output layer below:

        ######################

    def forward(self, img):
        x = img.view(-1, 28*28) # reshape the image to be a single row
        # pass x through both layers, with ReLU in between
        ### YOUR CODE HERE ###

        ######################
        return x

model = Net()

Question 2: Defining the Loss and Optimizer

When training a torch model, typically you need to specify the following two items:

optimizer: specifies how gradient-descent updates are applied to the model parameters. We will use the optim.Adam optimizer with a learning rate of 0.001 in this example.

loss_fn: the objective function to minimize. In a classification task, the cross-entropy loss is used.

Please fill in the optimizer with an appropriate learning rate lr, and choose an appropriate number of epochs (number of passes through the data) in the following code.

Note: remember that the neural network outputs the logits instead of the class probabilities (why? answer the question below), and make sure to specify this in the loss function.
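One practical reason a loss function may prefer logits over probabilities is numerical stability. The plain-Python sketch below contrasts a naive computation (exponentiate, normalize, then take the log) with one that works directly on logits using the log-sum-exp trick. It illustrates the general idea only and is not a description of PyTorch's internal implementation; both function names are hypothetical.

```python
import math

def naive_cross_entropy(logits, true_class):
    """Exponentiate first, then take the log: overflows for large logits."""
    exps = [math.exp(s) for s in logits]
    return -math.log(exps[true_class] / sum(exps))

def stable_cross_entropy(logits, true_class):
    """Work directly on logits with the log-sum-exp trick."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in logits))
    return log_sum_exp - logits[true_class]

small = [2.0, 1.0, 0.1]
print(naive_cross_entropy(small, 0), stable_cross_entropy(small, 0))

large = [1000.0, 999.0, 998.0]
try:
    naive_cross_entropy(large, 0)
except OverflowError as err:
    print("naive version failed:", err)
print("stable version:", stable_cross_entropy(large, 0))
```

For moderate logits the two versions agree; for large logits only the version that stays in log space returns a value.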

In [ ]:
loss_fn = nn.CrossEntropyLoss()

### YOUR CODE HERE ###

######################

Question 3: The neural network specified above does not output class probabilities, because the last layer of the neural network is a linear layer whose outputs can take any value in (-∞, ∞). Your choice of loss function above should take care of that, but what mathematical function maps these logit values to class probabilities?

#

YOUR ANSWER HERE

#

Training

Now let’s train the model for your chosen number of epochs. By the end of the training, you should expect an accuracy above 0.98.

In each step, we need to:

1.) grab x and y from the batch (note that each batch is a tuple of x and y)

2.) zero the optimizer’s gradients

3.) make a prediction y_pred

4.) call the loss_fn between y and y_pred

5.) backpropagate

6.) make the appropriate step calculated by the optimizer
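The six steps above can be sketched on a toy problem that is unrelated to MNIST. Everything in this example is an assumption made for illustration: the synthetic data, the small toy_model, the learning rate of 0.01 and the 30 epochs. Only the order of the calls mirrors the list.

```python
import torch
from torch import nn
import torch.optim as optim

torch.manual_seed(0)

# Toy stand-in for a DataLoader: 4 batches of 16 points with 4 features.
# The label is 1 when the first feature is positive, else 0.
batches = []
for _ in range(4):
    x = torch.randn(16, 4)
    y = (x[:, 0] > 0).long()
    batches.append((x, y))

toy_model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
toy_loss_fn = nn.CrossEntropyLoss()
toy_optimizer = optim.Adam(toy_model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(30):
    running = 0.0
    for batch in batches:
        x_batch, y_batch = batch                # 1) grab x and y
        toy_optimizer.zero_grad()               # 2) zero the gradients
        y_logit = toy_model(x_batch)            # 3) predict logits
        loss = toy_loss_fn(y_logit, y_batch)    # 4) compute the loss
        loss.backward()                         # 5) backpropagate
        toy_optimizer.step()                    # 6) update the parameters
        running += loss.item()
    epoch_losses.append(running / len(batches))

print("first epoch loss:", epoch_losses[0])
print("last epoch loss:", epoch_losses[-1])
```

Zeroing the gradients at the start of every step matters because PyTorch accumulates gradients across backward() calls by default.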

In [ ]:
epochs = 10

for epoch in range(epochs):
    losses = []
    accuracies = []
    for batch in train_loader:
        correct, total = 0, 0
        x_batch, y_batch = batch
        optimizer.zero_grad()

        ### YOUR CODE HERE ###

        ######################

        for index, output in enumerate(y_logit):
            y_pred = torch.argmax(output)
            if y_pred == y_batch[index]:
                correct += 1
            total += 1

        ### YOUR CODE HERE ###

        ######################

        loss.backward()
        optimizer.step()
        losses.append(loss.item())
        accuracies.append(correct/total)

    avg_loss = np.mean(np.array(losses))
    avg_accuracy = np.mean(np.array(accuracies))
    print('epoch ' + str(epoch+1) + ' average loss: ', avg_loss,
          '-- average accuracy: ', avg_accuracy)

Test Evaluation

Finally, we evaluate our model on the test set. You can expect the test accuracy to be slightly lower than the training accuracy.

In [ ]:
with torch.no_grad():
    correct = 0
    total = 0
    for batch in test_loader:
        x_batch, y_batch = batch
        y_logit = model(x_batch)
        for index, output in enumerate(y_logit):
            y_pred = torch.argmax(output)
            if y_pred == y_batch[index]:
                correct += 1
            total += 1

print('testing accuracy:', correct/total)

Make Prediction

Question 4: Fill in the following code block to estimate class probabilities and make predictions on the test images. Store the results in class_probabilities and predicted_labels. Compare the predictions with the true labels, stored in true_labels, by computing the accuracy; it should match the test accuracy above.

(Hint: you can use much of the same structure from the cell above. You can use F.softmax to calculate probabilities from the logits, and store the results however you please.)
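As an illustration of the hint, the sketch below applies F.softmax to a small made-up logit tensor (3 images, 4 classes) and compares the resulting predictions with made-up labels. The names toy_logits, toy_labels, toy_probs and toy_preds are hypothetical; how results are collected across test batches is left to the exercise.

```python
import torch
from torch.nn import functional as F

# Made-up logits for 3 images and 4 classes (not MNIST outputs).
toy_logits = torch.tensor([[2.0, 0.5, -1.0, 0.0],
                           [0.1, 0.2, 3.0, -2.0],
                           [-1.0, 4.0, 0.0, 1.0]])
toy_labels = torch.tensor([0, 2, 3])

toy_probs = F.softmax(toy_logits, dim=1)   # normalize across classes, per row
toy_preds = toy_probs.argmax(dim=1)        # most probable class per row
toy_accuracy = (toy_preds == toy_labels).float().mean().item()

print("row sums:", toy_probs.sum(dim=1))
print("predictions:", toy_preds.tolist())
print("accuracy:", toy_accuracy)
```

Passing dim=1 normalizes over the class dimension, so each row sums to 1. Since softmax preserves the ordering of the logits, the predicted class is the same whether argmax is taken on the probabilities or on the logits.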

In [ ]:
### YOUR CODE HERE ###

########################

print('accuracy verification: ', sum(true_labels==predicted_labels)/len(true_la
