In this section, you are going to train models on the publicly available CIFAR-10 dataset.
The CIFAR-10 dataset consists of 60000 32×32 color images in 10 classes, with 6000 images per
class. For more information, you are encouraged to look at their webpage. You are expected to
implement a Convolutional Neural Network (CNN) to classify the images based on their content. A free T4 GPU in Colab is highly recommended for executing the code in this section.
1. Implement a shallow CNN with the layers listed below; a sketch of one possible PyTorch implementation follows the list.
• A convolution layer with 32 kernels of size 3×3
• A ReLU activation
• A convolution layer with 64 kernels of size 3×3
• A ReLU activation
• A max-pooling layer with a kernel size of 2×2
• A convolution layer with 64 kernels of size 3×3
• A ReLU activation
• A convolution layer with 64 kernels of size 3×3
• A ReLU activation
• A flattening layer (this layer reshapes a 3D tensor into a feature vector)
• A fully connected layer with an output size of 10 (classes should be predicted as numerical values, i.e. 0-9)
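If you use PyTorch, one possible sketch of this architecture is shown below. It is only an illustration and assumes valid convolutions with no padding; the fully connected input size is computed from the kernel size so the same class can be reused for the 5×5 variant in item 6.

import torch.nn as nn

class ShallowCNN(nn.Module):
    def __init__(self, num_classes=10, kernel_size=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size),   # 32 kernels
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size),  # 64 kernels
            nn.ReLU(),
            nn.MaxPool2d(2),                 # 2x2 max pooling
            nn.Conv2d(64, 64, kernel_size),  # 64 kernels
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size),  # 64 kernels
            nn.ReLU(),
        )
        self.flatten = nn.Flatten()          # 3D feature map -> feature vector
        # Spatial size of the last feature map for a 32x32 input with no padding:
        # 10x10 for 3x3 kernels, 4x4 for 5x5 kernels.
        side = (32 - 2 * (kernel_size - 1)) // 2 - 2 * (kernel_size - 1)
        self.fc = nn.Linear(64 * side * side, num_classes)

    def forward(self, x):
        return self.fc(self.flatten(self.features(x)))

ShallowCNN() builds the 3×3 network for this item; ShallowCNN(kernel_size=5) builds the variant asked for in item 6.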
Note: If you use PyTorch, you can use print(next(model.parameters()).device) to check whether you are using the GPU for training.
2. Use the PyTorch class torchvision.datasets.CIFAR10 to load the dataset; a data-loading sketch is shown below.
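One possible way to load the data and build the split described in item 3 below. This is a sketch only; it uses ToTensor as the sole preprocessing step, and you may want to add normalization.

import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()  # normalization could be added here

train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_full = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# Split the 10,000 official test images 1:1 into validation and test subsets (item 3).
val_set, test_set = random_split(
    test_full, [5000, 5000], generator=torch.Generator().manual_seed(0)
)

train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32, shuffle=False)
test_loader = DataLoader(test_set, batch_size=32, shuffle=False)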
3. The training, validation, and test settings are listed below; a sketch of the corresponding training setup follows the list.
• 50,000 images for training (training set). Divide the 10,000 test-set images of CIFAR-10 into two subsets at a 1:1 ratio: 5,000 images for validation (validation set) and 5,000 images for final testing (test set).
• Batch size = 32.
• SGD optimizer with an initial learning rate of 0.002.
• Loss function: categorical cross-entropy criterion.
• The number of training iterations can be 90,000 or more (if you count in epochs, 58 epochs or more). Perform a validation pass every 5,000 iterations (3 epochs). This may take about 30 minutes on a T4 GPU. If you do not have enough computational resources, you are allowed to reduce the number of images by taking a subset of the dataset or to reduce the number of training iterations (epochs). State the size of your subset and your number of iterations (epochs) in the README.
• Use the default settings for the rest of the hyperparameters.
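One possible shape for the training and validation loop. It assumes the model, loaders, and split from the sketches above; names such as ShallowCNN, train_loader, and val_loader come from those sketches, not from the assignment itself.

import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = ShallowCNN().to(device)
criterion = nn.CrossEntropyLoss()                    # categorical cross entropy
optimizer = optim.SGD(model.parameters(), lr=0.002)  # SGD with initial lr = 0.002

def evaluate(model, loader):
    """Return (mean loss, accuracy) of the model on a loader."""
    model.eval()
    loss_sum, correct, total = 0.0, 0, 0
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss_sum += criterion(outputs, labels).item() * labels.size(0)
            correct += (outputs.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
    return loss_sum / total, correct / total

history = {"iter": [], "train_loss": [], "val_loss": [], "val_acc": []}
best_val_acc, iteration, max_iters, eval_every = 0.0, 0, 90000, 5000

while iteration < max_iters:
    for images, labels in train_loader:
        model.train()
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        iteration += 1
        if iteration % eval_every == 0:
            val_loss, val_acc = evaluate(model, val_loader)
            history["iter"].append(iteration)
            history["train_loss"].append(loss.item())  # last-batch loss; a running average is smoother
            history["val_loss"].append(val_loss)
            history["val_acc"].append(val_acc)
            if val_acc > best_val_acc:                 # keep the best-validation checkpoint (item 5)
                best_val_acc = val_acc
                torch.save(model.state_dict(), "best_model.pt")
        if iteration >= max_iters:
            break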
4. Plot the training loss, validation loss, and validation accuracy over the training iterations (or epochs); a plotting sketch is shown below. Fig. 1 shows an example. State whether the training appears to be overfitting and why.
Fig. 1. Training loss, validation loss, and validation accuracy over the training iterations.
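A minimal plotting sketch for these curves, assuming the history dictionary logged by the training sketch above:

import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(history["iter"], history["train_loss"], label="training loss")
ax1.plot(history["iter"], history["val_loss"], label="validation loss")
ax1.set_xlabel("iteration"); ax1.set_ylabel("loss"); ax1.legend()
ax2.plot(history["iter"], history["val_acc"], label="validation accuracy")
ax2.set_xlabel("iteration"); ax2.set_ylabel("accuracy"); ax2.legend()
plt.show()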
5. Please report the test accuracy on the test set from the iteration (or epoch) where the validation accuracy is maximum as your test accuracy result (see the sketch below).
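Assuming the best-validation checkpoint and the evaluate helper from the training sketch above, this number could be obtained as follows:

# Reload the checkpoint with the highest validation accuracy and score it on the test set.
model.load_state_dict(torch.load("best_model.pt"))
test_loss, test_acc = evaluate(model, test_loader)
print(f"Test accuracy at best-validation checkpoint: {test_acc:.4f}")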
6. Let us discuss the effects of the kernel size. Change all kernel sizes to 5×5 and train a new network, keeping the other hyperparameters the same. Compare the run time and the test accuracy of the models under the different kernel sizes, and briefly discuss the possible factors that affect the performance of a CNN.
7. Use the PyTorch class torchvision.models.resnet18 to implement the deep network ResNet18. Set the number of training iterations to 6,000 or more (if you count in epochs, 5 epochs or more) and perform a validation on the validation set every 500 iterations (1 epoch). Report the test accuracy on the test set from the iteration (or epoch) where the validation accuracy is maximum as the test accuracy result. The rest of the hyperparameters should be the same as for the shallow CNN above. A setup sketch is shown after this item. Note that:
7.1 By setting the parameter pretrained, you can choose to either train a new ResNet18 model
from scratch or fine-tune the ResNet18 model that has been fully trained on the ImageNet
dataset.
7.2 Since the image size of CIFAR-10 is 32×32 and the standard ResNet18 accepts 224×224 input by default, we may need to first resize the input images to 224×224 (you are free to use other available transformations, such as padding). Besides, the number of output features of the final fully connected layer of ResNet18 needs to be changed to 10 to meet the classification requirements of CIFAR-10.
Compare the impact of using a pre-trained ResNet18 versus training from scratch and discuss the reason. Compare the test accuracy of the deep ResNet18 versus the shallow CNN.
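A possible setup sketch for item 7. The resize is added through the dataset transform, and whether ImageNet weights are used is controlled by a flag; note that recent torchvision releases expose the assignment's pretrained parameter through the weights argument instead, so check your installed version.

import torch
import torch.nn as nn
from torchvision import models, transforms

# Resize the 32x32 CIFAR-10 images to the 224x224 input expected by the standard ResNet18.
resnet_transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
])
# Re-create the CIFAR-10 datasets and loaders with resnet_transform before training.

use_pretrained = True  # the assignment's `pretrained` choice: ImageNet weights vs. from scratch
weights = models.ResNet18_Weights.IMAGENET1K_V1 if use_pretrained else None
resnet = models.resnet18(weights=weights)

# Replace the final fully connected layer so it outputs 10 CIFAR-10 classes.
resnet.fc = nn.Linear(resnet.fc.in_features, 10)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
resnet = resnet.to(device)

The training and validation loop sketched earlier can be reused here with max_iters = 6000 and eval_every = 500.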
YOLOv8 is a state-of-the-art (SOTA) model for a wide range of object detection and tracking, instance segmentation, image classification, and pose estimation tasks. In this section, you will be asked to take a photo of a street in Montreal and summarize the information in this image by using a trained YOLOv8 model. This section does not ask you to implement and train the model from scratch; you can use a well-trained YOLOv8 model from its official implementation.
1. Use your cellphone or a digital camera to capture a street scene in Montréal.
2. Use the trained YOLOv8 object detection model to identify the types of objects included in the image (such as person, bicycle, vehicle, tree) and count the number of each object; see the sketch after this list.
3. Display the original and predicted images in your notebook.
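One way to do this with the official ultralytics package. The checkpoint name yolov8n.pt and the image path street.jpg are placeholders; any pretrained YOLOv8 detection checkpoint and your own photo work the same way.

from collections import Counter
import matplotlib.pyplot as plt
from PIL import Image
from ultralytics import YOLO

model = YOLO("yolov8n.pt")      # pretrained YOLOv8 detection checkpoint
results = model("street.jpg")   # run inference on your street photo
result = results[0]

# Count how many instances of each detected class appear in the image.
class_ids = result.boxes.cls.int().tolist()
counts = Counter(model.names[c] for c in class_ids)
print(counts)                   # e.g. Counter({'car': 5, 'person': 3, ...})

# Display the original image and the image with predicted boxes side by side.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 6))
ax1.imshow(Image.open("street.jpg")); ax1.set_title("original"); ax1.axis("off")
ax2.imshow(result.plot()[..., ::-1]); ax2.set_title("predicted"); ax2.axis("off")  # plot() returns BGR
plt.show()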