CMPT310-HW4
Detecting Handwritten Prime Digits with Neural Networks
1 Introduction
November 20, 2019
100 points
Prime numbers are fantastic. In this assignment we will use Multi Layer Per- ceptrons to detect handwritten prime digits. But before doing such a difficult task I suggest to try and solve an easier problem with MLP. If you succeed, which I know you will, you can proceed with tackling the challenging problem of detecting handwritten prime digits.
2 Regression Toy Data
The first task is to learn the function that generated the following data, using a simple neural network.
Thefunctionthatproducedthisdataisactuallyy=x2+,where N(0,) is a random noise from a normal distribution with a small variance. We are go- ing to use an MLP with one hidden layer, which each has 5 neurons, to learn
1
an approximation of this function using the data that we have. This assign- ment comes with a starting code, which is incomplete and you are supposed to complete it.
2.1 Technical Details
2.1.1 Code
The code that comes with this assignment has multiple files including:
assignment
toy example regressor.py
layers.py
neural network.py utils.py
toy_example_regressor.py contains most of the codes that are related to the training procedure, including loading data and iteratively feeding mini-batches of data to the neural network, plotting the approximated function and the data, etc. Please read this file and understand it but you dont need to modify this.
layers.py contains definition of the layers that we use in this assignment, including DenseLayer, SigmoidLayer, and L2LossLayer. Your main re- sponsibility is to implement forward and backward functions for these layers.
neural_network.pycontainsthedefinitionofaneuralnetwork(NeuralNet class), which is an abstract class. This class basically takes care of running forward pass and propagating the gradients, backwards, from loss to the
first layer.
utils.py contains some useful functions.
2.1.2 Data
The training data for this problem, which consists of input data and labels, can be generated by the function get_data(), which you can find in the main file, toy_example_regressor.py.
2.1.3 Network Structure
For the regression problem (i.e. the first task) we defined a new class, SimpleNet , which is inherited from NeuralNet. SimpleNet contains two DenseLayer s, which one of them has hidden neurons with Sigmoid activation functions. Network definition can be found in toy_example_regressor.py.
2
2.2 Your Task (total: 80 points)
2.2.1 Implementing compute_activations, compute_gradients, and update_weights functions
There are three type of layers completely implemented in the layers.py file: DenseLayer, Sigmoid, and L2LossLayer. However, implementation of DenseLayer
is incomplete. You are supposed to implement the following functions
DenseLayer: This is a simple dense (or linear or fully connected) layer
that has two types of parameters: weights w and biases b.
compute_activations (15 points): The value of every output neuron is oi = x.wi + bi. The number of input and output neurons are specified in the __init__ function.
compute_gradients (20 points): Assume that gradient of the loss with respect to the output neurons, self._output_error_gradient , are computed by the next layer already. You need to compute the gradients of the loss with respect to all the parameters of this layer (i.e. b and w) and store them in self.dw and self.db so that you can use them the update the parameters later. Needless to say that shape of dw and w should be equal, and same goes for db and b. In addition, you should compute the gradient of the loss with respect to the input, which is the output of the previous layer, and store it in self._input_error_gradient. This value will be passed on to the previous layer in the network, which will be used to compute the gradients recursively (Back Propagation).
update_weights (10 points): You should perform Stochastic Gradi- ent Descent and update the weights using the current weights, gradi- ents, and the given learning rate ( newweights = currentweights learningr ate gradient )
You can refer to the implementations of Sigmoid and L2LossLayer. Note: Its up to you how to implement these functions. However, it would be computationally less expensive if you use numpy matrix operations to compute the value of the neurons or gradients.
Your next task is to implement same functions for the NeuralNet class.
compute_activations (10 points): Iterate over all layers starting from the first one and compute the activations for each layer. At the end return the output of the last layer along with the value of loss.
compute_gradients (10 points): You are supposed to perform back prop- agation. In other words, first compute the gradient of the lass layer and pass it to the last layer. Then starting from the last layer iterate over all layers backwards, first compute its gradient and then pass it to the previous layer.
3
update_weights (5 points): You should update the parameters of all the layers (i.e. those who have parameters). You can use the update_weight function of each layer
2.2.2 Training the model (10 points)
Once you are done with implementing and testing the correctness of the im- plemented functions, you are ready to build a multi layer Perceptron and train it.
There is already an existing starter code for you at toy_example_regressor .py. Its a script that contains definition of a simple two layer MLP with scripts that loads the data and trains the MLP. At the end of the training the code plots data and the approximated function. Also the network weights will be saved to a file with this name simple_net_weights-{timestamp}.pkl. You should change its name to simple_net_weights.pkl and include it in your submission.. The timestamp is added to make sure you dont overwrite some previously well trained model.
In addition, you should check the loss and the saved image. Check if the predicted function is similar to f(x) = x2 and matches the validation data. You should include the saved image, data_function.png, both in your report and in your submission.
Note that this is a regression task, so the last layer of the MLP only has one neuron without any activation functions.
3 Detecting Prime Digits
Now we can use the same layers to distinguish one digit prime numbers (i.e. 2,3,5, and 7) from one digit composite numbers (i.e. 1, 4, 6, 8, and 9). This is a binary classification problem. So the MLP that we are going to use will have only one neuron in the last layer with a sigmoid activation function.
3.1 Data
The dataset that you will be using for this task is the MNIST dataset, which contains gray scale images of hand written digits. Sizes of images are 28 by 28 pixels. We have already preprocessed it for you. We have set the labels for prime digits to 1, and 0 otherwise. Also we have normalized the values of pixels (i.e. pixel values are in the interval of [0, 1]).
3.2 Network Structure
The input of the network is a vector with 784 neurons (28 28) The network has one hidden layer with 20 neurons with sigmoid activation functions. The output of the network is one neuron with sigmoid activation function.
4
3.3 Your Task (20 points)
For this task, all you need to do is to read and understand the prime_classifier .py code and then run it. Over the course of training, loss values and accuracies on the validation set is printed. At the end of the training, network parame- ters will be saved in a file with the following format prime_net_weights-{ timestamp}.pkl. You should change its name to prime_net_weights.pkl and include it in your submission.. The timestamp is added to make sure you dont overwrite some previously well trained model.
4 Testing Your Code
To help you with testing your code, a number of tester files have been included. You can use them to test your implementations. For example if you run:
$ python test_layers.py
You can see if the three functions (DenseLayer.compute_activations(), DenseLayer.compute_gradients(), and DenseLayer.update_weights()) that you implemented in layers.py file work properly or not. We recommend to use all the four test files that are included:
testers
test layers.py
test neural network.py
test toy example regressor.py test prime classifier.py
Also, you can estimate the total points that you might get for this assign- ment by running the following code:
$ python evaluate_assignment.py
Note: We will use stronger test cases to test your code and grade your assignment. So passing these tests does not guarantee anything. These tests are only meant to help you with this assignment.
5 What to submit
You should include all the following files in a tar.gz or zip file with your student id (either YOUR_STUDENT_ID.tar.gz or YOUR_STUDENT_ID.zip).
1. Your code. Do not change the signature of the functions that you were supposed to implement. Do NOT include the dataset (assignment4- mnist.pkl)
2. You should include the following files that are automatically saved in your submission:
prime_net_weights.pkl simple_net_weights.pkl
5
data_function.png
3. A report.pdf file concisely explaining what you did in this assignment. Also in your report include your models loss (for both problems) and accuracy (only for prime digit detection).
Note: your code will be evaluated with an automated script. So if you dont follow the above steps, you will lose all or a portion of your points for this assignment.
6
Reviews
There are no reviews yet.