CIFAR2, not CIFAR10
Your task is a binary classification problem. While the CIFAR10 dataset has 10 possible classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck), you will build a CNN that takes in an image and correctly predicts its class to be either cat or dog, hence CIFAR2. We limit this assignment to a binary classification problem so that you can train the model in a reasonable amount of time.
The assignment has 2 parts.
Our stencil provides a model class with several methods and hyperparameters you need to use for your network. You will also fill out a function that performs the convolution operator. Finally, you will answer questions related to the assignment and class material as part of this assignment.
Part 1: The Model
Roadmap
You will notice that the structure of the Model class is very similar to the Model class defined in your first assignment. We strongly suggest that you first complete the Intro to TensorFlow Lab before starting this assignment. The lab includes many explanations about the way a Model class is structured, what variables are, and how things work in TensorFlow. If you come into hours with questions about TensorFlow-related material that is covered in the lab, we will direct you to the lab.
Below is a brief outline of some things you should do. We expect you to fill in some of the missing gaps (review lecture slides to understand the pipeline) as this is your second assignment.
Step 1. Preprocess the data
We have provided you with a function unpickle(file) in the preprocess file stencil, which unpickles an object and returns a dictionary. Do not edit it. We have already extracted the inputs and the labels from the pickled file into a dictionary for you, as you can see within get_data.
You will want to limit the inputs and labels returned by get_data to those representing the first and second classes of your choice. For every image and its corresponding label, if the label is not of the first or second class, then remove the image and label from your inputs and labels arrays.
At this point, your inputs are still two dimensional. You will want to reshape your inputs into (num_examples, 3, 32, 32) using np.reshape(inputs, (-1, 3, 32, 32)) and then transpose them so that the final inputs you return have shape (num_examples, 32, 32, 3).
Recall that the label of your first class might be something like 5, representing a dog in the CIFAR dataset, but you will want to turn that into a 0 since this is a binary classification problem. Likewise, for all images of the second class, say a cat, you will want to turn those labels into a 1.
After doing that, you will want to turn your 0s and 1s into one-hot vectors, where the index with a 1 represents the class of the correct image. You can do this with the function tf.one_hot(labels, depth=2).
This can be a bit confusing, so we'll make it clear: your labels should be of size (num_images, num_classes). So for the first example, the corresponding label might be [0, 1], where a 1 in the second index means that it's a cat/dog/hamster/sushi.
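The steps above can be sketched in NumPy as follows. This is a minimal sketch: the function name and exact signature are our own, not the stencil's, and the one-hot step uses np.eye where your code would use tf.one_hot.

```python
import numpy as np

def filter_and_relabel(inputs, labels, first_class, second_class):
    """Keep only images of the two chosen classes, remap labels to 0/1,
    reshape/transpose to (num_examples, 32, 32, 3), and one-hot encode.
    inputs: (num_examples, 3072) raw pixel array; labels: list of 0-9 ints."""
    labels = np.array(labels)
    mask = (labels == first_class) | (labels == second_class)
    inputs, labels = inputs[mask], labels[mask]
    # Normalize pixel values to [0, 1] to avoid numerical overflow issues.
    inputs = inputs.astype(np.float32) / 255.0
    # Channels come first in the raw data; move them to the last axis.
    inputs = np.reshape(inputs, (-1, 3, 32, 32)).transpose(0, 2, 3, 1)
    # First class -> 0, second class -> 1, then one-hot to (num_images, 2).
    binary = (labels == second_class).astype(np.int64)
    one_hot = np.eye(2)[binary]  # tf.one_hot(binary, depth=2) in TensorFlow
    return inputs, one_hot
```

For example, with labels [3, 5, 1, 5] and classes 3 and 5, this keeps three examples with one-hot labels [[1, 0], [0, 1], [0, 1]].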
Step 2. Create your model
You will not receive credit if you use the tf.keras, tf.layers, and tf.slim libraries. You can use tf.keras for your optimizer but do NOT use Keras layers!
Again, you should initialize all hyperparameters within the constructor, even though this is not customary. This is still necessary for the autograder. Consider what's being learned in a CNN and initialize those as trainable parameters. In the last assignment, it was our weights and biases. This time around, you will still want weights and biases, but there are other things that are being learned!
We recommend using an Adam Optimizer [tf.keras.optimizers.Adam] with a learning rate of 1e-3, but feel free to experiment with whatever produces the best results.
Weight variables should be initialized from a normal distribution (tf.random.truncated_normal) with a standard deviation of 0.1.
You may use any permutation and number of convolution, pooling, and feed forward layers, as long as you use at least one convolution layer with strides of [1, 1, 1, 1], one pooling layer, dropout, and one fully connected layer.
If you are having trouble getting started with model architecture, we have provided an example below:
Convolution Layer 1 [tf.nn.conv2d]
16 filters of width 5 and height 5
strides of 2 and 2
same padding
Batch Normalization 1 [tf.nn.batch_normalization]
Get the mean and variance using [tf.nn.moments]
ReLU Nonlinearity 1 [tf.nn.relu]
Max Pooling 1 [tf.nn.max_pool]
kernels of width 3 and height 3
strides of 2 and 2
Convolution Layer 2
20 filters of width 5 and height 5
strides of 1 and 1
same padding
Batch Normalization 2
ReLU Nonlinearity 2
Max Pooling 2
kernels of width 2 and height 2
strides of 2 and 2
Convolution Layer 3
20 filters of width 5 and height 5
strides of 1 and 1
same padding
Batch Normalization 3
ReLU Nonlinearity 3
Dense Layer 1
Dropout with rate 0.3
Dense Layer 2
Dropout with rate 0.3
Dense Layer 3
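To sanity-check this example architecture, you can trace the spatial dimensions by hand. A small sketch, assuming SAME padding on the pooling layers as well (the list above does not specify pooling padding); with SAME padding, the output spatial size is ceil(size / stride):

```python
import math

def same_out(size, stride):
    # SAME padding: output spatial size is ceil(size / stride).
    return math.ceil(size / stride)

h = 32              # CIFAR images are 32x32
h = same_out(h, 2)  # Convolution 1: 5x5 filters, stride 2, SAME -> 16
h = same_out(h, 2)  # Max pooling 1: 3x3 kernel, stride 2 -> 8
h = same_out(h, 1)  # Convolution 2: 5x5 filters, stride 1, SAME -> 8
h = same_out(h, 2)  # Max pooling 2: 2x2 kernel, stride 2 -> 4
h = same_out(h, 1)  # Convolution 3: 5x5 filters, stride 1, SAME -> 4
flat = h * h * 20   # 4 * 4 * 20 = 320 inputs to Dense Layer 1
```

Tracing shapes like this before coding makes it easy to size the first dense layer's weight matrix correctly.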
Fill out the call function using the trainable variables you've created. Note that in the lab, we mentioned using a @tf.function decorator to tell TF to run it in graph execution. Do NOT do this for this assignment; we'll explain why the forward pass has to be run in eager execution later. The parameter is_testing will be used in Part 2; do not worry about it when implementing everything in this part.
Step 3. Calculate loss
Calculate the average softmax cross-entropy loss on the logits compared to the labels. We suggest using tf.nn.softmax_cross_entropy_with_logits.
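For intuition, here is what the per-example softmax cross-entropy computes, sketched in NumPy (in your model you should use the TensorFlow op, not this):

```python
import numpy as np

def softmax_xent(logits, labels):
    """Per-example softmax cross-entropy; average it to get the batch loss."""
    # Subtract the row max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

# With equal logits over 2 classes, the loss is ln(2) ~ 0.693 per example.
loss = softmax_xent(np.array([[0.0, 0.0]]), np.array([[1.0, 0.0]])).mean()
```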
Step 4. Train and test
In the main function, you will want to get your train and test data, initialize your model, and train it for many epochs. We suggest training for 10 epochs. For the autograder, we will train it for at most 25 epochs (hard limit of 10 minutes). We have provided for you a train and test method to fill out. The train method will take in the model and do the forward and backward pass for a SINGLE epoch. Yes, this means that, unlike the first assignment, your main function will have a for loop that goes through the number of epochs, calling train each time.
Even though this is technically part of preprocessing, you should shuffle your inputs and labels when TRAINING. Keep in mind that they have to be shuffled in the same order. We suggest creating a range of indices of length num_examples, then using tf.random.shuffle(indices). Finally you can use tf.gather(train_inputs, indices) to shuffle your inputs. You can do the same with your labels to ensure they are shuffled the same way.
You should also reshape the inputs into (batch_size, width, height, in_channels) before calling model.call(). When training, you might find it helpful to call tf.image.random_flip_left_right on your batch of image inputs to increase accuracy. Do not call this when testing.
Call the model's forward pass and calculate the loss within the scope of tf.GradientTape. Then use the model's optimizer to apply the gradients to your model's trainable variables outside of the GradientTape. If you're unsure about this part, please refer to the lab. This is synonymous with doing the gradient_descent function in the first assignment, except that TensorFlow handles all of that for you!
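The same shuffling idea, sketched with NumPy for clarity (in your code you would use the tf.random.shuffle and tf.gather calls mentioned above):

```python
import numpy as np

def shuffle_together(inputs, labels):
    """Shuffle inputs and labels with the same permutation so pairs stay aligned.
    TF equivalent: indices = tf.random.shuffle(tf.range(n));
    tf.gather(inputs, indices) and tf.gather(labels, indices)."""
    indices = np.random.permutation(inputs.shape[0])
    return inputs[indices], labels[indices]
```

Using a single index permutation for both arrays is what guarantees each image keeps its own label.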
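The GradientTape pattern can be sketched with a toy scalar variable standing in for the model's trainable variables; this is a minimal illustration of the tape mechanics, not the full train method:

```python
import tensorflow as tf

w = tf.Variable(3.0)  # stands in for model.trainable_variables
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

# Forward pass and loss go inside the tape's scope...
with tf.GradientTape() as tape:
    loss = w ** 2  # stand-in for the softmax cross-entropy loss

# ...gradient computation and application happen outside it.
gradients = tape.gradient(loss, [w])
optimizer.apply_gradients(zip(gradients, [w]))
```

In your train method, `loss` would come from model.call on a batch, and `[w]` would be model.trainable_variables.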
If you'd like, you can calculate the train accuracy to check that your model does not overfit the training set. If you get upwards of 80% accuracy on the training set but only 65% accuracy on the testing set, you might be overfitting.
The test method will take in the same model, now with trained parameters, and return the accuracy given the test inputs and test labels.
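One common way to compute accuracy from logits and one-hot labels, sketched in NumPy (the function name is ours, not the stencil's):

```python
import numpy as np

def accuracy(logits, labels):
    """Fraction of examples where the argmax of the logits
    matches the argmax of the one-hot label."""
    predictions = np.argmax(logits, axis=1)
    truth = np.argmax(labels, axis=1)
    return np.mean(predictions == truth)
```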
At the very end, we have written a method for you to visualize your results. The visualizer will not be graded but you can use it to check out your doggos and kittens.
For fun, instead of passing in the indexes for dog and cats for your training and testing data, you can pass in other inputs and see how your model does when trying to classify something like bird vs. cat!
Your README can just contain your accuracy and any bugs you have.
Mandatory Hyperparameters
You can train with any batch size, but you are limited to training for at most 25 epochs (I know, the title of this section is a bit misleading). However, your model must train using TensorFlow functions and test using your own convolution function within 10 minutes on a department machine. We will be timing this when autograding. Again, the parameters we suggest are training for 10 epochs using a batch size of 64.
Reading in the Data
The CIFAR files are pickled objects. We have provided you with a function unpickle(filename). You should not edit it. Note: You should normalize the pixel values so that they range from 0 to 1 (this can easily be done by dividing each pixel value by 255) to avoid any numerical overflow issues.
Data format
The testing and training data files to be read in are in the following format:
train: A pickled object of 50,000 train images and labels. This includes images and labels of all 10 classes. After unpickling the file, the dictionary will have the following elements:
data: a 50000x3072 numpy array of uint8s. Each row of the array stores a 32x32 colour image. The first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue. The image is stored in row-major order, so that the first 32 entries of the array are the red channel values of the first row of the image.
labels: a list of 50000 numbers in the range 0-9. The number at index i indicates the label of the ith image in the array data.
Note that if you download the dataset from online, the training data is actually divided into batches. We have done the job of repickling all of the batches into one single train file for your ease.
test: A pickled object of 10,000 test images and labels. This includes images and labels of all 10 classes. Unpickling the file gives a dictionary with the same key values as above.
We've already done the job of unpickling the file and have extracted the unprocessed inputs and labels in the get_data function.
To get only the images and labels of classes 3 and 5 (representing dog and cat), you will want to loop over the data and only add it to your result array of inputs and labels if they belong to those classes.
Visualizing Results
We've provided the visualize_results(image_data, probabilities, image_labels, first_label, second_label) method for you to visualize your predictions against the true labels using matplotlib, a useful Python library for plotting graphs. This method is currently written with the image_labels having a shape of (num_images, num_classes). DO NOT EDIT THIS FUNCTION. You should call this function after training and testing, passing into visualize_results an input of 10 images, 10 probabilities, 10 labels, the first label name, and second label name.
Unlike the first assignment, you will need to pass in the strings of the first and second classes. A visualize_results method call might look like: visualize_results(image_inputs, probabilities, image_labels, "cat", "dog").
This should result in a visual of 10 images with your predictions and the actual label written above so you can compare your results! You should do this after you are sure you have met the benchmark for test accuracy.
Part 2: Conv2d
Before starting this part of the assignment, you should ensure that you have an accuracy of at least 70% on the test set using only TensorFlow functions for the problem of classifying dogs and cats.
As a new addition to this assignment, you will be implementing your very own convolution function! Deep Learning == TensorFlow tutorial no more!
For the sake of simple math calculations (less is more, no?), we'll require that our conv2d function only works with a stride of 1 (for both width and height). This is because the calculation for padding size changes as a result of the stride, which would be way more complex and unreasonable for a second assignment.
Do NOT change the parameters of the function we have provided. Even though the conv2d function takes in a strides argument, you should ALWAYS pass in [1, 1, 1, 1]. Leaving in strides as an argument was a conscious design choice: if you wanted to eventually make the function work for other kinds of strides in your own time, this would allow you to easily change it.
Roadmap
Your inputs will have 4 dimensions. If we are to use this conv2d function for the first layer, the inputs would be [batch_size, in_height, in_width, input_channels].
You should ensure that the input's number of in-channels is equivalent to the filters' number of in-channels. Make sure to add an assert statement or throw an error if they are not the same. You will lose points if you do not do this.
If padding is SAME, you will have to determine a padding size. Luckily, for strides of 1, padding is just (filter_size - 1) / 2. The derivation for this formula is out of the scope of this course, but if you are interested, you may read about it here.
You can use this hefty NumPy function np.pad to pad your input! Note that for SAME padding, the way you pad may result in different output shapes for inputs with odd dimensions. This is OK. We will only test that your convolution function works similarly to TensorFlow's using inputs with even (i.e. divisible by 2) dimensions for SAME padding.
After padding (if needed), you will want to go through the entire batch of images and perform the convolution operator on each image. There are two ways of going about this: you can continuously append multi-dimensional NumPy arrays to an output array, or you can create a NumPy array with the correct output dimensions and just update each element in the output as you perform the convolution operator. We suggest the latter; it's conceptually easier to keep track of things this way.
Your output dimension height is equal to (in_height + 2*padY - filter_height) / strideY + 1 and your output dimension width is equal to (in_width + 2*padX - filter_width) / strideX + 1. Refer to the slides if you'd like to understand this derivation.
You will want to iterate over the entire height and width, including padding, stopping when you cannot fit a filter over the rest of the padded input. For convolution with many input channels, you will want to perform the convolution per input channel and sum those dot products together.
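Putting the roadmap together, here is a minimal NumPy sketch of a stride-1, SAME-padding convolution. The function name and simplified signature are ours; the stencil's conv2d also takes strides and padding arguments.

```python
import numpy as np

def conv2d_same_stride1(inputs, filters):
    """Stride-1, SAME-padding convolution (really cross-correlation,
    matching tf.nn.conv2d).
    inputs:  (batch, in_height, in_width, in_channels)
    filters: (filter_height, filter_width, in_channels, out_channels)"""
    batch, in_h, in_w, in_c = inputs.shape
    f_h, f_w, f_in, out_c = filters.shape
    assert in_c == f_in, "input channels must match filter in-channels"

    # For stride 1, SAME padding is (filter_size - 1) / 2 per side.
    pad_h, pad_w = (f_h - 1) // 2, (f_w - 1) // 2
    padded = np.pad(inputs, ((0, 0), (pad_h, pad_h), (pad_w, pad_w), (0, 0)))

    # Output dims: (in + 2*pad - filter) / stride + 1, with stride 1.
    out_h = in_h + 2 * pad_h - f_h + 1
    out_w = in_w + 2 * pad_w - f_w + 1
    output = np.zeros((batch, out_h, out_w, out_c))

    for b in range(batch):
        for y in range(out_h):
            for x in range(out_w):
                patch = padded[b, y:y + f_h, x:x + f_w, :]  # (f_h, f_w, in_c)
                for k in range(out_c):
                    # Elementwise multiply, then sum over height, width,
                    # AND input channels.
                    output[b, y, x, k] = np.sum(patch * filters[:, :, :, k])
    return output
```

On a 4x4 single-channel input with a 3x3 all-ones filter, the output stays 4x4 and each entry is the sum of the input values under the (padded) window.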
Testing out your own conv2d:
We have provided for you a few tests that compare the result of your very own conv2d and TensorFlow's conv2d. If you've implemented it correctly, the results should be very similar.
The last super important part of this project is that you should call your conv2d function IN your model. TensorFlow cannot build a graph/differentiate with NumPy operators, so you should not add a @tf.function decorator.
In your model, you should set is_testing to True when testing, then make sure that if is_testing is True, you use your own convolution rather than TensorFlow's conv2d on a SINGLE convolution layer. If you follow the architecture described above, we suggest adding an if statement before the third convolution layer (i.e. switch out the conv2d for your third convolution). This part will take the longest, and is why we say it might actually take up to 15 minutes on a local machine.
Autograder
Your model must complete training within 10 minutes AND under 25 epochs on a department machine.
Our autograder will import your model and your preprocessing functions. We will feed the result of your get_data function called on a path to our data and pass the result to your train method in order to return a fully trained model. After this, we will feed in your trained model, alongside the TA pre-processed data, to our custom test function. This will just batch the testing data using YOUR batch size and run it through your model's call function. However, we will test that your model can test with any batch size, meaning that you should not hardcode self.batch_size in your call function. The logits which are returned will then be fed through an accuracy function. When testing your own convolution function, we will only test on inputs with even dimensions for SAME padding. This is because you might get different output dimensions than TensorFlow's convolution function when using SAME padding on odd inputs. In order to ensure you don't lose points, you need to make sure that you A) correctly return training inputs and labels from get_data, B) ensure that your model's call function returns logits from the inputs specified and does not break on different batch sizes when testing, C) make sure your own convolution function works, and D) have no part of your code rely on any packages outside of TensorFlow, NumPy, Matplotlib, or the Python standard library.
