COMS E6998-010 Homework 5, Fall 2020
Problem 1 SSD, ONNX model, Visualization, Inferencing
In this problem we will be inferencing an SSD ONNX model using ONNX Runtime Server. You will follow the GitHub repo and ONNX tutorials linked below. You will start with a pretrained PyTorch SSD model and retrain it for your target categories. Then you will convert this PyTorch model to ONNX and deploy it on the ONNX Runtime server for inferencing.
- Download the pretrained PyTorch MobileNetV1 SSD and test it locally on the Pascal VOC 2007 dataset. Show the test accuracy for the 20 classes. (4)
- Select any two related categories from the Google Open Images dataset and fine-tune the pretrained SSD model. Examples include Aircraft and Aeroplane, or Handgun and Shotgun. You can use the Python script provided in the GitHub repo to download the data. For fine-tuning you can use the same parameters as in the tutorial below. Compute the accuracy on the test data for these categories before and after fine-tuning. (5+5)
- Convert the PyTorch model to ONNX format and save it (see the sketch after this list). (4)
- Visualize the model using the net_drawer tool. Generate the graph with the embed_docstring flag and show the visualization output. Also show the doc strings (the stack traces for PyTorch) for different types of nodes. (6)
- Deploy the ONNX model on the ONNX Runtime (ORT) server. You need to set up the environment following the steps listed in the tutorial, and then make an HTTP request to the ORT server. Test the inferencing setup using one image from each of the two selected categories (see the request sketch below). (6)
- Parse the response message from the ORT server and annotate the two images. Show the inferencing output (bounding boxes with labels) for the two images. (5)
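For parts 3 and 4, here is a minimal sketch of the conversion and visualization steps. It assumes the model-builder and load() helpers from the qfgaohao/pytorch-ssd repo; the checkpoint path, tensor names, and class count are placeholders to adapt to your fine-tuned model.

```python
# Minimal sketch for parts 3-4, assuming the model-builder helpers from the
# qfgaohao/pytorch-ssd repo; checkpoint path, tensor names, and class count
# below are placeholders to match to your own fine-tuned model.
import torch
from vision.ssd.mobilenetv1_ssd import create_mobilenetv1_ssd  # from the repo

num_classes = 3  # background + your two Open Images categories
model = create_mobilenetv1_ssd(num_classes, is_test=True)
model.load("models/finetuned-ssd.pth")  # hypothetical checkpoint path
model.eval()

# Part 3: export to ONNX. The repo's SSD expects 300x300 RGB input.
dummy_input = torch.randn(1, 3, 300, 300)
torch.onnx.export(model, dummy_input, "ssd.onnx",
                  input_names=["input"],            # reused in the ORT request later
                  output_names=["scores", "boxes"])

# Part 4: render the graph with ONNX's net_drawer, embedding doc strings
# (the PyTorch stack traces) in each node, as in the visualization tutorial.
import onnx
from onnx.tools.net_drawer import GetPydotGraph, GetOpNodeProducer

onnx_model = onnx.load("ssd.onnx")
pydot_graph = GetPydotGraph(
    onnx_model.graph,
    name=onnx_model.graph.name,
    rankdir="TB",
    node_producer=GetOpNodeProducer(embed_docstring=True),
)
pydot_graph.write_dot("ssd.dot")  # render with: dot -Tsvg ssd.dot -o ssd.svg
```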
For parts 1, 2, and 3, refer to the steps in the GitHub repo. For part 4, refer to the ONNX tutorial on visualizing a model, and for parts 5 and 6, refer to the ONNX tutorial on inferencing.
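For parts 5 and 6, the sketch below follows the request/response format of the ONNX Runtime Server inferencing tutorial. It assumes the protobuf bindings generated in that tutorial (onnx_ml_pb2, predict_pb2) are importable, that the server listens locally on port 9001 under the default model name, and that tensor names match the export sketch above; preprocessing is simplified relative to the tutorial, and box coordinates are assumed to be relative.

```python
# Sketch for parts 5-6, following the ONNX Runtime Server tutorial's format.
# Assumes the tutorial's generated protobuf bindings (onnx_ml_pb2, predict_pb2)
# are importable and the server runs locally on port 9001; paths and names are
# placeholders. Preprocessing is simplified (no mean/std normalization here).
import numpy as np
import requests
from PIL import Image, ImageDraw

import onnx_ml_pb2
import predict_pb2

# Part 5: build and send a PredictRequest for one preprocessed image.
image = Image.open("test_image.jpg").resize((300, 300))   # hypothetical path
input_array = np.asarray(image, dtype=np.float32)[None].transpose(0, 3, 1, 2)

input_tensor = onnx_ml_pb2.TensorProto()
input_tensor.dims.extend(input_array.shape)
input_tensor.data_type = 1                                  # 1 = FLOAT
input_tensor.raw_data = input_array.tobytes()

request_message = predict_pb2.PredictRequest()
request_message.inputs["input"].CopyFrom(input_tensor)      # matches export name

response = requests.post(
    "http://127.0.0.1:9001/v1/models/default/versions/1:predict",
    headers={"Content-Type": "application/octet-stream",
             "Accept": "application/x-protobuf"},
    data=request_message.SerializeToString(),
)

# Part 6: parse the PredictResponse and draw one box per confident detection.
response_message = predict_pb2.PredictResponse()
response_message.ParseFromString(response.content)
scores = np.frombuffer(response_message.outputs["scores"].raw_data, dtype=np.float32)
boxes = np.frombuffer(response_message.outputs["boxes"].raw_data,
                      dtype=np.float32).reshape(-1, 4)

draw = ImageDraw.Draw(image)
for box, cls_scores in zip(boxes, scores.reshape(len(boxes), -1)):
    if cls_scores.max() > 0.5:                              # confidence threshold
        x1, y1, x2, y2 = (box * 300).tolist()               # assumes relative coords
        draw.rectangle([x1, y1, x2, y2], outline="red")
        draw.text((x1, y1), f"class {cls_scores.argmax()}: {cls_scores.max():.2f}")
image.save("annotated.jpg")
```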
References
- GitHub repo. Single Shot MultiBox Detector Implementation in PyTorch. Available at https://github.com/qfgaohao/pytorch-ssd
- ONNX tutorial. Visualizing an ONNX Model. Available at https://github.com/onnx/tutorials/blob/master/tutorials/VisualizingAModel.md
- ONNX tutorial. Inferencing SSD ONNX model using ONNX Runtime Server. Available at https://github.com/onnx/tutorials/blob/master/tutorials/OnnxRuntimeServerSSDModel.ipynb
- Google. Open Images Dataset V5 + Extensions. Available at https://storage.googleapis.com/openimages/web/index.html
- The PASCAL Visual Object Classes Challenge 2007. Available at http://host.robots.ox.ac.uk/pascal/VOC/voc2007/
Problem 2 ML Cloud Platforms
In this question you will analyze different ML cloud platforms and compare their service offerings. In particular, you will consider the ML cloud offerings from IBM, Google, Microsoft, and Amazon and compare them on the basis of the following criteria:
- Frameworks: DL framework(s) supported and their versions. (4)
Here we are referring to machine learning platforms that provide their own built-in images for the different frameworks.
- Compute units: type(s) of compute units offered, i.e., GPU types. (2)
- Model lifecycle management: tools supported to manage ML model lifecycle. (2)
- Monitoring: availability of application logs and resource (GPU, CPU, memory) usage monitoring data to the user. (2)
- Visualization during training: support for visualizing performance metrics such as accuracy and throughput during training. (2)
- Elastic scaling: support for elastically scaling the compute resources of an ongoing job. (2)
- Training job description: training job description file format. Show how the same training job is specified on the different ML platforms. Identify similar fields in the training job files of the 4 ML platforms through an example (an illustrative sketch follows this list). (6)
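For the last item, the sketch below shows the rough shape of one platform's training job specification, written as a Python dict mirroring Google AI Platform's JSON/YAML job config. The field names on the other three platforms differ but map onto the same concepts (entry point, code package, compute tier, region, framework version); treat the exact keys as assumptions to verify against each platform's documentation.

```python
# Illustrative only: the rough shape of a training-job description, written as
# a Python dict mirroring Google AI Platform's job config. Other platforms'
# manifests use different keys for analogous concepts; verify against the docs.
gcp_training_job = {
    "jobId": "ssd_finetune_job",                 # job name / identifier
    "trainingInput": {
        "region": "us-central1",                 # where the job runs
        "scaleTier": "BASIC_GPU",                # compute units requested
        "packageUris": ["gs://my-bucket/trainer-0.1.tar.gz"],  # training code package
        "pythonModule": "trainer.task",          # entry point into the package
        "runtimeVersion": "2.1",                 # framework/runtime version
        "pythonVersion": "3.7",
        "args": ["--epochs", "10"],              # hyperparameters passed to the trainer
    },
}
```

Your comparison should line up these fields with their counterparts on the other platforms, e.g., the entry script and compute target in an Azure ML job definition, or the training image and instance type in a SageMaker training job.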
Problem 3 Kubeflow, MiniKF, Kale
In this problem we will follow the Kubeflow-Kale codelab (link below). You will follow the steps as outlined in the codelab to install Kubeflow with MiniKF, convert a Jupyter notebook to Kubeflow Pipelines, and run Kubeflow Pipelines from inside a notebook. For each step below you need to show the commands executed, the terminal output, and a screenshot of the visual output (if any). You also need to give a new name to your GCP project and to any resource instance you create, e.g., put your initials in the name string.
- Setting up the environment and installing MiniKF: Follow the steps in the codelab to:
- Set up a GCP project. (2)
- Install MiniKF and deploy your MiniKF instance. (3)
- Log in to MiniKF, Kubeflow, and Rok. (3)
- Run a Pipeline from inside your Notebook: Follow the steps in the codelab to:
- Create a notebook server. (3)
- Download and run the notebook: We will be using the pytorch-classification notebook from the examples repo. Note that the codelab uses a different example from that repo (titanic_dataset_ml.ipynb). (4)
- Convert your notebook to a Kubeflow Pipeline: Enable Kale, then compile and run the pipeline from the Kale Deployment Panel. Show the output from each of the 5 steps of the pipeline. (5)
- Show screenshots of the Graph and Run output of the experiment. (4)
- Cleanup: Destroy the MiniKF VM. (1)
References
- Codelab. From Notebook to Kubeflow Pipelines with MiniKF and Kale. Available at https://codelabs.developers.google.com/codelabs/cloud-kubeflow-minikf-kale
Problem 4 Deep Reinforcement Learning
This question is based on Deep RL concepts discussed in Lecture 8. You need to refer to the papers by Mnih et al., Nair et al., and Horgan et al. to answer this question. All papers are linked below.
- Explain the difference between episodic and continuous tasks. Give an example of each. (2)
- What do the terms exploration and exploitation mean in RL? Why do the actors employ an ε-greedy policy for selecting actions at each step? Should ε remain fixed or follow a schedule during deep RL training? How does the value of ε help balance exploration and exploitation during training? (1+1+1+1)
- How is the Deep Q-Learning algorithm different from Q-learning? You will follow the steps of the Deep Q-Learning algorithm in Mnih et al. (2013), page 5, and explain each step in your own words (an illustrative sketch of the key ingredients follows this list). (3)
- What is the benefit of having a target Q-network? (3)
- How does experience replay help make Q-learning more efficient? (3)
- What is prioritized experience replay? (2)
- Compare and contrast GORILA (General Reinforcement Learning Architecture) and Ape-X architecture. Provide three similarities and three differences. (3)
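To make the ε-greedy, experience replay, and target network questions concrete, here is a minimal illustrative PyTorch sketch of those three ingredients. This is not the papers' actual code: the network sizes, annealing schedule, buffer capacity, and sync period are arbitrary placeholders.

```python
# Illustrative sketch (not the papers' code) of three DQN ingredients the
# questions above ask about: a linearly annealed epsilon-greedy policy, a
# uniformly sampled replay buffer, and a periodically synced target network.
import random
from collections import deque

import torch
import torch.nn as nn

obs_dim, n_actions = 4, 2  # toy dimensions; use your environment's shapes

def make_net():
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))

q_net, target_net = make_net(), make_net()
target_net.load_state_dict(q_net.state_dict())  # start the two networks in sync
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)  # replay buffer of (s, a, r, s', done) tuples
gamma = 0.99

def epsilon(step, start=1.0, end=0.1, anneal_steps=10_000):
    """Linear schedule: explore heavily early, exploit more as training proceeds."""
    return start + min(step / anneal_steps, 1.0) * (end - start)

def act(state, step):
    """Epsilon-greedy action selection."""
    if random.random() < epsilon(step):
        return random.randrange(n_actions)  # explore: uniform random action
    with torch.no_grad():                   # exploit: greedy w.r.t. current Q
        return q_net(torch.as_tensor(state, dtype=torch.float32)).argmax().item()

def train_step(batch_size=32):
    """One Q-learning update on a uniform minibatch from the replay buffer."""
    if len(replay) < batch_size:
        return
    s, a, r, s2, done = zip(*random.sample(replay, batch_size))
    s, s2 = torch.as_tensor(s), torch.as_tensor(s2)
    a = torch.as_tensor(a)
    r, done = torch.as_tensor(r), torch.as_tensor(done, dtype=torch.float32)
    # The bootstrap target uses the frozen target network, which stabilizes
    # learning by keeping the regression target fixed between syncs.
    with torch.no_grad():
        y = r + gamma * (1.0 - done) * target_net(s2).max(dim=1).values
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(q, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Every C environment steps, copy the online weights into the target network:
#     target_net.load_state_dict(q_net.state_dict())
```

References
- Mnih et al. Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602, 2013. Available at https://arxiv.org/abs/1312.5602
- Nair et al. Massively Parallel Methods for Deep Reinforcement Learning. arXiv:1507.04296, 2015. Available at https://arxiv.org/abs/1507.04296
- Horgan et al. Distributed Prioritized Experience Replay. arXiv:1803.00933, 2018. Available at https://arxiv.org/abs/1803.00933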