Real-time Object Detection and Classification
Submission via Blackboard
Design and implement a computer vision model pipeline and perform matching to identify events from a given video stream. Figure 1 shows the high-level block diagram architecture of the pipeline. The pipeline consists of a video reader and a model cascade. The task is to detect object events (car) with a specific attribute (car type). The pipeline can be developed in Python (highly recommended) or Java [1].
Video: The given video clip (duration 30 seconds) is of a road cross-section.
Pipeline Design: Implement a pipeline that processes the video frames using one or more DNN models.
Pipeline Input: Feed the video into the system through a video reader (Fig. 1) at a rate of 30 frames per second. This requires reading the video file and simulating a stream in which each image frame is an individual data item.
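One possible way to simulate the stream is sketched below. It assumes OpenCV (`cv2`) is installed for decoding; the function names `read_frames` and `paced` are illustrative and not part of the assignment. The pacing logic is kept separate from the decoding so it can be tried with any iterable of frames.

```python
import time
from typing import Any, Callable, Iterable, Iterator, Tuple


def read_frames(path: str) -> Iterator[Any]:
    """Yield decoded frames from a video file (assumes OpenCV is installed)."""
    import cv2  # imported lazily so the pacing logic below runs without it

    capture = cv2.VideoCapture(path)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of file or decode failure
                break
            yield frame
    finally:
        capture.release()


def paced(frames: Iterable[Any], fps: float = 30.0,
          clock: Callable[[], float] = time.monotonic,
          sleep: Callable[[float], None] = time.sleep) -> Iterator[Tuple[int, Any]]:
    """Emit (frame_index, frame) pairs no faster than `fps` items per second."""
    interval = 1.0 / fps
    start = clock()
    for index, frame in enumerate(frames):
        due = start + index * interval  # scheduled emission time of this frame
        delay = due - clock()
        if delay > 0:
            sleep(delay)
        yield index, frame
```

A typical use would be `for index, frame in paced(read_frames("clip.mp4")): ...`, where `clip.mp4` is a placeholder file name. Scheduling each frame against the start time, rather than sleeping a fixed interval after each frame, keeps the average rate near 30 fps even when decoding time varies.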
Model Cascade: The model cascade will be a 2-stage computer vision pipeline. The following steps need to be performed to build the model cascade:
- Stage 1 - Object Detector: Deploy a state-of-the-art object detector (Tiny YOLO) pre-trained on the PASCAL VOC or MS COCO dataset. The Tiny YOLO model and weights can be downloaded from https://github.com/qqwweee/keras-yolo3/ or https://pjreddie.com/darknet/yolo/
- Stage 2 - Attribute Classifier (Car Type): Using a transfer learning approach, train an existing pre-trained image classifier (MobileNet from the Keras library, https://keras.io/applications/, or Deeplearning4j for Java) on two car classes:
SUV
Sedan
The input to this attribute classifier will be the Region of Interest obtained from Stage-1.
- Dataset Preparation for Stage 2: Students need to create a small dataset of images to train the classifier. The more images, the better the classifier will be trained. It is recommended to download at least 400 images for each class (SUV and Sedan), for example by scraping them from Google with any image-downloading API, and to divide the dataset in an 80:20 ratio for training and testing. Then split the training set in an 80:20 ratio into training and validation sets.
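The Stage 2 preparation above can be sketched roughly as follows. This is a minimal sketch, not a prescribed implementation: `split_dataset` and `build_car_type_classifier` are hypothetical names, the frozen-backbone setup with a dropout layer is one common transfer learning choice among several, and TensorFlow's bundled Keras is assumed to be installed when the model is built.

```python
import random
from typing import Dict, List, Sequence

CLASS_NAMES = ["SUV", "Sedan"]  # index order assumed for the classifier output


def split_dataset(paths: Sequence[str], seed: int = 0) -> Dict[str, List[str]]:
    """Split one class's image paths 80:20 into (train+val)/test, then 80:20 into train/val."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = len(shuffled) // 5            # 20% held out for testing
    test, rest = shuffled[:n_test], shuffled[n_test:]
    n_val = len(rest) // 5                 # 20% of the remainder for validation
    val, train = rest[:n_val], rest[n_val:]
    return {"train": train, "val": val, "test": test}


def build_car_type_classifier(input_size: int = 224):
    """Frozen ImageNet MobileNet backbone with a new two-class softmax head."""
    import tensorflow as tf  # lazy import: only needed when the model is built

    base = tf.keras.applications.MobileNet(
        input_shape=(input_size, input_size, 3),
        include_top=False, weights="imagenet", pooling="avg")
    base.trainable = False  # train only the new head at first
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(len(CLASS_NAMES), activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

With 400 images per class, the split yields 256 training, 64 validation and 80 test images for that class. Splitting each class separately keeps the two classes balanced across the three subsets. Input images would also need the MobileNet preprocessing (`tf.keras.applications.mobilenet.preprocess_input`) before training or inference.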
Queries: Two queries need to be executed after deploying the model pipeline. The queries are listed in order of increasing complexity.
- Q1 [Object Detection Task]: Identify and count the number of cars in each frame.
- Q2 [2-Stage Model Cascade (Object Detection + Attribute Classifier (Car Type))]: Identify and count the cars and their types in each frame.
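The two queries could be answered per frame along the following lines. The detection format (a dict with `label`, `score` and `box` keys), the 0.5 confidence threshold and the `classify_car_type` callable are assumptions made for illustration; the real detector output depends on the Tiny YOLO wrapper used.

```python
from collections import Counter
from typing import Any, Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels
Detection = Dict[str, Any]       # assumed keys: "label", "score", "box"


def car_boxes(detections: List[Detection], min_score: float = 0.5) -> List[Box]:
    """Keep the boxes the detector labelled 'car' with sufficient confidence."""
    return [d["box"] for d in detections
            if d["label"] == "car" and d["score"] >= min_score]


def clamp_box(box: Box, width: int, height: int) -> Box:
    """Clip a box to the frame so the Stage 2 crop never leaves the image."""
    x1, y1, x2, y2 = box
    return (max(0, x1), max(0, y1), min(width, x2), min(height, y2))


def answer_queries(frame: Any, frame_size: Tuple[int, int],
                   detections: List[Detection],
                   classify_car_type: Callable[[Any, Box], str]) -> Dict[str, Any]:
    """Return the Q1 car count and the Q2 per-type counts for one frame."""
    width, height = frame_size
    clamped = [clamp_box(b, width, height) for b in car_boxes(detections)]
    boxes = [b for b in clamped if b[2] > b[0] and b[3] > b[1]]  # drop empty boxes
    # Stage 2 runs only on Stage 1 car boxes; for a NumPy frame the classifier
    # would crop the region of interest as frame[y1:y2, x1:x2].
    types = Counter(classify_car_type(frame, box) for box in boxes)
    return {"q1_car_count": len(boxes), "q2_type_counts": dict(types)}
```

Passing the classifier in as a callable keeps the cascade logic independent of the model library, so Q1 can be run alone by supplying a trivial classifier.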
Optimisation: Implement a processing-flow optimisation of the pipeline and the model cascade for Q1 and Q2. One option is a producer-consumer design, in which a producer thread continuously streams video frames into an internal queue, and a consumer (the DNN models) reads the frames from the queue and processes them.
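A minimal sketch of the producer-consumer option, using only the Python standard library, is given below. The function name `run_pipeline`, the queue size and the single-consumer layout are illustrative assumptions; `process` stands in for the model cascade.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List

_STOP = object()  # sentinel telling the consumer that the stream has ended


def run_pipeline(frames: Iterable[Any], process: Callable[[Any], Any],
                 max_queue: int = 64) -> List[Any]:
    """Read frames in one thread and process them in another via a bounded queue."""
    buffer: "queue.Queue[Any]" = queue.Queue(maxsize=max_queue)
    results: List[Any] = []

    def producer() -> None:
        try:
            for frame in frames:
                buffer.put(frame)  # blocks while the queue is full (back-pressure)
        finally:
            buffer.put(_STOP)      # always signal end of stream

    def consumer() -> None:
        while True:
            item = buffer.get()
            if item is _STOP:
                break
            results.append(process(item))

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With a single consumer the results keep the frame order. The bounded queue stops the reader from running arbitrarily far ahead of the models. Threads mainly help here when frame decoding and model inference spend their time in native code that releases the interpreter lock. This sketch does not handle an exception raised inside `process`; a fuller version would need to stop the producer in that case.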
Output and Results Evaluation: Record your results in the format given below and compare your results to the ground truth.
For Q1 and Q2 do the following:
- Compare the count with ground truth and report the accuracy (F1 Score).
- Report the training accuracy of the car type classifier.
- Report the throughput of Q1 and Q2. For the evaluation, the experiments section of the VidCEP paper may be useful to consult.
- Report on the effects of your flow optimisation for Q1 and Q2.
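The brief does not define how an F1 score is derived from per-frame counts, so the sketch below makes an explicit assumption: in each frame, the overlap between predicted and ground-truth counts is treated as true positives, any excess as false positives and any shortfall as false negatives. The function names are hypothetical. Throughput is taken as frames processed per second of wall-clock time.

```python
from typing import Dict, Sequence


def count_f1(predicted: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Precision, recall and F1 from per-frame counts (count-overlap assumption)."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must cover the same frames")
    tp = sum(min(p, t) for p, t in zip(predicted, truth))     # matched cars
    fp = sum(max(p - t, 0) for p, t in zip(predicted, truth))  # over-counted cars
    fn = sum(max(t - p, 0) for p, t in zip(predicted, truth))  # missed cars
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


def throughput(frames_processed: int, elapsed_seconds: float) -> float:
    """Frames processed per second of wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return frames_processed / elapsed_seconds
```

For Q2 the same function can be applied once per car type, using the per-frame SUV counts and the per-frame Sedan counts separately. Measuring throughput before and after the flow optimisation, on the same machine and clip, gives the comparison asked for in the last item.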
Code (50 Marks)
- Pipeline Design
- Dataset Preparation (download 500 images of each class)
- Preprocessing
- Deploying the pre-trained SOTA Object Detector [TinyYolo]
- Car Type Classifiers
- Model Cascade Deployment
- Pipeline Optimisation
- Comments to explain your source code. Insufficient comments will lead to mark deductions.
Report (50 Marks)
You are required to submit a short report (approximately 4-5 pages including references) specifying:
- Pipeline Design: A description of your approach and design decisions. This should include details on the flow of the pipeline and on the video reader, object detector and classifier. Justify your design decisions. Where appropriate, include diagrams and references to research papers to support your arguments.
- Pipeline Output: Detail the output of your pipeline with all the result graphs (event extraction, throughput, etc.).
- Model Training Evaluation: A description of how the classifier model was trained, covering dataset preparation, the transfer learning approach and the training process. Report the accuracy of the trained model.
- Pipeline Optimisation: A description of how you have optimized the execution of the pipeline and the Model Cascade. Report the improvement of the optimised pipeline. Justify your design decisions.
- Design Strengths and Weaknesses: Detail the strengths and weaknesses of your approach. Justify your design decisions. Where appropriate, include references to research papers to support your arguments (minimum 1 page).
- Download Link for Submission Files: Provide a download link for the ZIP archive containing your demo video, code and the CSV file with complete results.
Groups and Individual
The assignment may be completed as a group. Each group can have a maximum of 2 students. Groups are self-selected. Please email your group details (students' names) to [email protected] by the end of the day on Friday 19th Feb. If you cannot form a group or have any other questions, please contact Jaleed at the above address.
There is also the option to complete the assignment as an individual. Please email [email protected] to confirm you will complete it as an individual by the end of day Friday 19th Feb. For students who have not contacted Jaleed, we assume you will complete it as an individual.
Grading will be appropriate for individual and group submissions.
Mistakes cannot be corrected once the deadline has passed.
[1] For Java users: the Deeplearning4j library (https://deeplearning4j.org/) includes Tiny YOLO and MobileNet models.