In this homework you will experiment with SIFT features for scene matching and object recognition. You will work with the SIFT tutorial and code from the University of Toronto. In the compressed homework file, you will find the tutorial document (tutSIFT04.pdf) and a paper from the International Journal of Computer Vision (ijcv04.pdf) describing SIFT and object recognition. Although the tutorial document assumes a MATLAB implementation, you should still be able to follow the technical details in it. In addition, you are STRONGLY encouraged to read this paper unless you're already quite familiar with matching and recognition using SIFT.
There are 3 problems in this homework, worth a total of 100 points. Two bonus questions, worth an extra 5 and 15 points, are provided under Problems 1 and 2 respectively. The maximum number of points you may earn from this homework is 100 + 20 = 120. Be sure to read the Submission Guidelines below. They are important.
Using SIFT in OpenCV 3.x.x on a Local Machine
Feature descriptors like SIFT and SURF have not been included in the main OpenCV package since version 3. This section provides instructions on how to use SIFT for those who use OpenCV 3.x.x. If you are using OpenCV 2.x.x, you are all set and can skip this section. If you are curious about why SIFT was removed, read https://www.pyimagesearch.com/2015/07/16/where-did-sift-and-surf-go-in-opencv-3/
We strongly recommend using SIFT in Colab for this homework; the details are described in the next section.
However, if you want to use SIFT on your local machine, one simple way to use OpenCV's built-in SIFT function is to switch back to version 2.x.x. If you want to keep using OpenCV 3.x.x, do the following:
- uninstall your original OpenCV package
- install opencv-contrib-python using pip (pip is a Python tool for installing packages written in Python); detailed instructions are at https://pypi.python.org/pypi/opencv-contrib-python
After you have OpenCV set up, you should be able to use cv2.xfeatures2d.SIFT_create() to create a SIFT object, whose functions are listed at http://docs.opencv.org/3.0-beta/modules/xfeatures2d/doc/nonfree_features.html
Using SIFT in OpenCV 3.x.x in Colab (RECOMMENDED)
The default version of OpenCV in Colab is 3.4.3. If we use the SIFT method directly, we will typically get this error message:
error: OpenCV(3.4.3) /io/opencv_contrib/modules/xfeatures2d/src/sift.cpp:1207: error: (-213:The function/feature is not implemented) This algorithm is patented and is excluded in this configuration; Set OPENCV_ENABLE_NONFREE CMake option and rebuild the library in function create
One simple way to use OpenCV's built-in SIFT function in Colab is to switch to a contrib version that still includes it. Below is an example of switching the OpenCV version:
- Run the following command in a Colab cell (it is already included in this assignment):
pip install opencv-contrib-python==3.4.2.16
- Restart runtime by
Runtime -> Restart Runtime
Then you should be able to use cv2.xfeatures2d.SIFT_create() to create a SIFT object, whose functions are listed at http://docs.opencv.org/3.0-beta/modules/xfeatures2d/doc/nonfree_features.html
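If you switch between OpenCV builds, a small helper can hide where the SIFT constructor lives. The sketch below is a convenience, not part of the assignment: make_sift is a hypothetical name, and it takes the imported cv2 module as its argument. It tries cv2.xfeatures2d.SIFT_create() first (the contrib location described above) and falls back to cv2.SIFT_create(), which exists in OpenCV 4.4 and later.

```python
def make_sift(cv2_module):
    """Return a SIFT object from whichever location this OpenCV build provides.

    cv2_module is the imported cv2 module; make_sift is a hypothetical helper.
    """
    xfeatures2d = getattr(cv2_module, "xfeatures2d", None)
    if xfeatures2d is not None and hasattr(xfeatures2d, "SIFT_create"):
        # Contrib builds (e.g. opencv-contrib-python 3.4.2.16).
        return xfeatures2d.SIFT_create()
    if hasattr(cv2_module, "SIFT_create"):
        # Main-module location in OpenCV 4.4 and later.
        return cv2_module.SIFT_create()
    raise RuntimeError("SIFT is not available in this OpenCV build")
```

Typical use would be import cv2, then sift = make_sift(cv2), then keypoints, descriptors = sift.detectAndCompute(gray_image, None).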
Some Resources
In addition to the tutorial document, the following resources can definitely help you in this homework:
- http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_feature2d/py_matcher/py_matcher.html
- http://docs.opencv.org/3.1.0/da/df5/tutorial_py_sift_intro.html
- http://docs.opencv.org/3.0-beta/modules/xfeatures2d/doc/nonfree_features.html?highlight=sift#cv2.SIFT
- http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_geometric_transformations/py_geometric_transformations.html
Problem 1: Match transformed images using SIFT features
{40 points + bonus 5} You will transform a given image, and match it back to the original image using SIFT keypoints.
Step 1 (5pt). Use the detection function of the SIFT class to detect keypoints in the given image. Plot the image with keypoint scale and orientation overlaid.
Step 2 (10pt). Rotate your image clockwise by 60 degrees with the cv2.warpAffine function. Extract SIFT keypoints for this rotated image and plot the rotated picture with keypoint scale and orientation overlaid, just as in step 1.
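The sign of the rotation angle is easy to get wrong. The sketch below rebuilds, in plain NumPy, the 2x3 matrix layout documented for cv2.getRotationMatrix2D, in which a positive angle means counter-clockwise rotation, so a clockwise rotation needs a negative angle. The function name rotation_matrix_2x3 and the 640x480 image size are assumptions made for illustration.

```python
import numpy as np

def rotation_matrix_2x3(center, angle_deg, scale=1.0):
    """Build a 2x3 rotation matrix in the layout documented for cv2.getRotationMatrix2D.

    Positive angle_deg rotates counter-clockwise in image coordinates
    (origin at the top-left), so a clockwise rotation needs a negative angle.
    """
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    alpha = scale * np.cos(theta)
    beta = scale * np.sin(theta)
    return np.array([
        [alpha, beta, (1.0 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1.0 - alpha) * cy],
    ])

# Hypothetical 640x480 image: rotate 60 degrees clockwise about its center.
w, h = 640, 480
M = rotation_matrix_2x3((w / 2.0, h / 2.0), -60.0)
# With OpenCV, this matrix would be passed as cv2.warpAffine(img, M, (w, h)).
```

Because the output size is kept at (w, h), the corners of the rotated image fall outside the canvas and are cropped.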
Step 3 (15pt). Match the SIFT keypoints of the original image and the rotated image using the knnMatch function in the cv2.BFMatcher class. Discard bad matches using the ratio test proposed by D. Lowe in the SIFT paper. Use 0.1 as the ratio in this homework. Note that this value is for display purposes only. Draw the filtered good keypoint matches on the image and display it. The image you draw should have two images side by side with matching lines across them.
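To make the ratio test concrete, here is a minimal NumPy sketch of the idea: for each descriptor, find its two nearest neighbours and keep the best one only if its distance is less than ratio times the second-best distance. In the homework the two neighbours come from cv2.BFMatcher().knnMatch(des1, des2, k=2), and each returned match has a .distance attribute. The function name ratio_test_matches and the toy 2-D descriptors are assumptions for illustration.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio):
    """Return (index_a, index_b) pairs that pass Lowe's ratio test.

    For each descriptor in desc_a, find its two nearest neighbours in desc_b
    by Euclidean distance and keep the nearest only if d1 < ratio * d2.
    """
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    diffs = desc_a[:, None, :] - desc_b[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    good = []
    for i, row in enumerate(dists):
        nearest, second = np.argsort(row)[:2]
        if row[nearest] < ratio * row[second]:
            good.append((i, int(nearest)))
    return good

# Toy 2-D "descriptors" (real SIFT descriptors have 128 dimensions).
desc_a = np.array([[0.0, 0.0], [5.0, 5.0]])
desc_b = np.array([[0.1, 0.0], [10.0, 10.0], [5.0, 6.0], [6.0, 5.0]])
good = ratio_test_matches(desc_a, desc_b, ratio=0.75)
```

The first descriptor has one clearly closest neighbour and is kept. The second has two equally close candidates, so the match is ambiguous and is discarded.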
Step 4 (10pt). Use the RANSAC algorithm to find the affine transformation from the rotated image to the original image. You are not required to implement the RANSAC algorithm yourself; instead, you can use the cv2.findHomography function (set the third parameter, method, to cv2.RANSAC) to compute the transformation matrix. Transform the rotated image back using this matrix and the cv2.warpPerspective function. Display the recovered image.
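Although you do not need to implement RANSAC, it helps to see what it does: repeatedly fit a model to a small random sample of matches, count how many matches agree with it, and refit on the largest agreeing set. The NumPy sketch below illustrates this for a 2x3 affine model. Note that cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, threshold) estimates a 3x3 homography instead, so this is an illustration of the loop, not a replacement for that call. All function names, the synthetic points, and the parameter values are assumptions.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine A such that dst ~= A @ [x, y, 1]."""
    X = np.hstack([src, np.ones((len(src), 1))])          # (n, 3)
    solution, _, _, _ = np.linalg.lstsq(X, dst, rcond=None)  # (3, 2)
    return solution.T                                      # (2, 3)

def apply_affine(A, pts):
    """Apply a 2x3 affine matrix to an (n, 2) array of points."""
    return pts @ A[:, :2].T + A[:, 2]

def ransac_affine(src, dst, n_iters=200, threshold=1.0, seed=0):
    """Return (A, inlier_mask) using a basic RANSAC loop."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Three correspondences are the minimum needed for an affine model.
        sample = rng.choice(len(src), size=3, replace=False)
        A = fit_affine(src[sample], dst[sample])
        errors = np.linalg.norm(apply_affine(A, src) - dst, axis=1)
        inliers = errors < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on every inlier of the best hypothesis.
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers

# Synthetic matches: 30 points under a known affine map, 8 of them corrupted.
rng = np.random.default_rng(1)
src = rng.uniform(0, 100, size=(30, 2))
A_true = np.array([[0.5, 0.866, 10.0], [-0.866, 0.5, 20.0]])
dst = apply_affine(A_true, src)
dst[:8] += rng.uniform(20, 50, size=(8, 2))
A_est, inliers = ransac_affine(src, dst)
```

The corrupted matches play the role of bad SIFT matches: a plain least-squares fit over all 30 points would be pulled away from the true transformation, while the RANSAC estimate ignores them.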
Bonus (5pt). You might have noticed that the rotated image from step 2 is cropped. Rotate the image without any cropping and you will be awarded an extra 5 points.
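One common way to avoid cropping is to enlarge the output canvas to the bounding box of the rotated image and shift the rotation matrix so the image center lands on the new canvas center. The NumPy sketch below shows that computation under these assumptions; rotation_without_cropping is a hypothetical name and the 640x480 size is only an example.

```python
import numpy as np

def rotation_without_cropping(w, h, angle_deg):
    """Return (M, (new_w, new_h)) for rotating a w x h image with no cropping."""
    cx, cy = w / 2.0, h / 2.0
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # Rotation about the image center (same layout as cv2.getRotationMatrix2D).
    M = np.array([
        [cos_t, sin_t, (1.0 - cos_t) * cx - sin_t * cy],
        [-sin_t, cos_t, sin_t * cx + (1.0 - cos_t) * cy],
    ])
    # Bounding box of the rotated rectangle.
    new_w = int(np.ceil(h * abs(sin_t) + w * abs(cos_t)))
    new_h = int(np.ceil(h * abs(cos_t) + w * abs(sin_t)))
    # Shift so the old center lands on the center of the larger canvas.
    M[0, 2] += new_w / 2.0 - cx
    M[1, 2] += new_h / 2.0 - cy
    return M, (new_w, new_h)

# Hypothetical 640x480 image rotated 60 degrees clockwise.
M, size = rotation_without_cropping(640, 480, -60.0)
# With OpenCV: cv2.warpAffine(img, M, size)
```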
Hint: If there are too many matches in the output image, use a ratio of 0.1 to filter them.
Problem 2: Scene stitching with SIFT features
{30 points + 15 bonus} You will match and align different views of a scene using SIFT features.
Use the cv2.copyMakeBorder function to pad the center image with zeros to a larger size. Hint: the final output image should be of size 1608 x 1312. Extract SIFT features for all images and go through the same procedures as you did in problem 1. Your goal is to find the affine transformation between the two images and then align one of your images to the other using cv2.warpPerspective. Use the cv2.addWeighted function to blend the aligned images and show the stitched result. Examples can be found at http://docs.opencv.org/trunk/d0/d86/tutorial_py_image_arithmetics.html. Use parameters 0.5 and 0.5 for alpha blending.
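The padding step amounts to choosing four border widths. The sketch below computes them so that the image ends up centered in the larger canvas, which is an assumption (the assignment does not say where the center image must sit). It uses np.pad as a stand-in for cv2.copyMakeBorder; center_padding is a hypothetical helper, and the sizes are chosen only for illustration.

```python
import numpy as np

def center_padding(h, w, out_h, out_w):
    """Return (top, bottom, left, right) border widths that center an h x w image."""
    if out_h < h or out_w < w:
        raise ValueError("output size must be at least the input size")
    top = (out_h - h) // 2
    left = (out_w - w) // 2
    return top, out_h - h - top, left, out_w - w - left

# Hypothetical sizes, chosen only for illustration.
img = np.ones((300, 400, 3), dtype=np.uint8)
top, bottom, left, right = center_padding(300, 400, 500, 900)
padded = np.pad(img, ((top, bottom), (left, right), (0, 0)), mode="constant")
# OpenCV equivalent:
# cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=0)
```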
Step 1 (15pt). Compute the transformation from the right image to the center image. Warp the right image with the computed transformation. Stitch the center and right images with alpha blending. Display the SIFT feature matching between the center and right images as you did in problem 1. Display the stitched result (center and right image).
Step 2 (15pt). Compute the transformation from the left image to the stitched image from step 1. Warp the left image with the computed transformation. Stitch the left image and the result image from step 1 with alpha blending. Display the SIFT feature matching between the result image from step 1 and the left image as you did in problem 1. Display the final stitched result (all three images).
Bonus (15pt). Instead of using cv2.addWeighted to do the blending, implement Laplacian pyramids to blend the two aligned images. A tutorial can be found at http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_pyramids/py_pyramids.html. Display the stitched result (center and right image) and the final stitched result (all three images) with Laplacian blending instead of alpha blending.
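The structure of Laplacian pyramid blending can be sketched in plain NumPy. To stay self-contained, the sketch below swaps in simplified operators: 2x2 block averaging in place of cv2.pyrDown and pixel repetition in place of cv2.pyrUp, so it is not the Gaussian-filtered pyramid from the tutorial. It assumes float images of shape (H, W, C) whose height and width are divisible by 2 to the power of levels, and a mask of shape (H, W, 1). All names are hypothetical.

```python
import numpy as np

def down(img):
    """Halve each dimension by averaging 2x2 blocks (stand-in for cv2.pyrDown)."""
    h, w = img.shape[:2]
    return img.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))

def up(img):
    """Double each dimension by pixel repetition (stand-in for cv2.pyrUp)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def gaussian_pyramid(img, levels):
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(down(pyramid[-1]))
    return pyramid

def laplacian_pyramid(img, levels):
    pyramid = []
    current = img
    for _ in range(levels):
        smaller = down(current)
        pyramid.append(current - up(smaller))  # detail lost by downsampling
        current = smaller
    pyramid.append(current)                    # coarsest level
    return pyramid

def collapse(pyramid):
    """Rebuild an image from a Laplacian pyramid, coarsest level first."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = up(current) + detail
    return current

def pyramid_blend(img_a, img_b, mask, levels=3):
    """Blend two images level by level; mask value 1 selects img_a."""
    lap_a = laplacian_pyramid(img_a, levels)
    lap_b = laplacian_pyramid(img_b, levels)
    weights = gaussian_pyramid(mask, levels)
    blended = [wt * la + (1.0 - wt) * lb
               for la, lb, wt in zip(lap_a, lap_b, weights)]
    return collapse(blended)

# Toy example: random 16x16 images, mask selecting the left half of img_a.
rng = np.random.default_rng(0)
img_a = rng.uniform(0, 255, size=(16, 16, 3))
img_b = rng.uniform(0, 255, size=(16, 16, 3))
mask = np.zeros((16, 16, 1))
mask[:, :8] = 1.0
result = pyramid_blend(img_a, img_b, mask)
```

With Gaussian filtering in place of the block average, the mask becomes progressively smoother at coarser levels, which is what hides the seam between the two images.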
Note that in the resulting stitched image, the overlapping region may look brighter or darker than the non-overlapping regions. To get full credit, the final image should have uniform illumination.
Hints: You need to find the warping matrix between images with the same mechanism as in problem 1. You will need as many reliable matches as possible to find a good homography, so DO NOT use 0.1 here. A suggested value would be 0.75 in this case.
When you warp the image with cv2.warpPerspective, an important trick is to pass in the correct parameters so that the warped image has the same size as the padded_center image. Once you have two images with the same size, find the overlapping part and do the blending.
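A call such as cv2.addWeighted(img_a, 0.5, img_b, 0.5, 0) averages the two images everywhere, so pixels covered by only one image come out at half brightness. One way to keep the illumination uniform is to blend only where both images have content. The NumPy sketch below shows that idea; it assumes that pixels equal to zero in every channel are padding, and blend_overlap is a hypothetical name.

```python
import numpy as np

def blend_overlap(img_a, img_b, alpha=0.5):
    """Alpha-blend two same-size color images only where both have content.

    Assumption: pixels that are exactly zero in every channel are padding.
    """
    a = img_a.astype(np.float64)
    b = img_b.astype(np.float64)
    has_a = a.sum(axis=2) > 0
    has_b = b.sum(axis=2) > 0
    overlap = has_a & has_b
    out = a + b  # correct wherever at most one image has content
    out[overlap] = alpha * a[overlap] + (1.0 - alpha) * b[overlap]
    return np.clip(out, 0, 255).astype(np.uint8)

# Toy 2x4 images: img_a covers columns 0-2, img_b covers columns 1-3.
img_a = np.zeros((2, 4, 3), dtype=np.uint8)
img_a[:, :3] = 100
img_b = np.zeros((2, 4, 3), dtype=np.uint8)
img_b[:, 1:] = 200
stitched = blend_overlap(img_a, img_b)
```

In the toy example, the columns covered by a single image keep their original values, and only the two shared columns are averaged.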
Problem 3: Object Recognition with HOG features
{30 points} You will use the histogram of oriented gradients (HOG) to extract features from objects and recognize them.
HOG decomposes an image into multiple cells, computes the direction of the gradients for all pixels in each cell, and creates a histogram of gradient orientation for that cell. Object recognition with HOG is usually done by extracting HOG features from a training set of images, learning a support vector machine (SVM) from those features, and then testing a new image with the SVM to determine the existence of an object.
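To make the per-cell histogram concrete, here is a simplified NumPy sketch for a single cell. It weights each pixel's vote by gradient magnitude and uses 9 unsigned orientation bins over 0 to 180 degrees, a common HOG configuration assumed here. It omits the vote interpolation and block normalization that full implementations such as cv2.HOGDescriptor perform. The function name is hypothetical.

```python
import numpy as np

def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientation for one cell."""
    cell = cell.astype(np.float64)
    gy, gx = np.gradient(cell)  # derivatives along rows, then along columns
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees.
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bin_width = 180.0 / n_bins
    # Clamp guards against a rounding result of exactly 180.0.
    bins = np.minimum((orientation // bin_width).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())
    return hist

# An 8x8 cell whose intensity increases left to right: a purely horizontal gradient.
cell = np.tile(np.arange(8, dtype=np.float64), (8, 1))
hist = cell_orientation_histogram(cell)
```

For this cell every pixel has the same gradient direction, so all of the weight falls into the first bin.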
You can use cv2.HOGDescriptor to extract the HoG feature and cv2.ml.SVM_create for SVMs (and a lot of other algorithms). You can also use Python machine learning packages for SVM, e.g. scikit-learn, and for HoG computation, e.g. scikit-image. Please find the OpenCV SVM tutorial at https://www.learnopencv.com/handwritten-digits-classification-an-opencv-c-python-tutorial/
An image set located under SourceImages/human_vs_birds is provided containing 20 images. You will first train an SVM with the HoG features and then predict the class of an image with the trained SVM. For simplicity, we will be dealing with a binary classification problem with two classes, namely, birds and humans. There are 10 images for each class.
Some of the function names and arguments are provided; you may change them as you see fit.
Step 1 (5pts). Load in the images and create a vector of corresponding labels (0 for bird and 1 for human). An example label vector should be something like [1,1,1,1,1,0,0,0,0,0]. Shuffle the images randomly and display them in a 2 x 10 grid with figsize = (18, 15).
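When shuffling, the images and labels must be reordered with the same permutation, otherwise the labels no longer match. A minimal sketch follows, using placeholder arrays in place of the loaded images; the label order and the seed are assumptions.

```python
import numpy as np

# Placeholder data: 20 tiny "images"; real code would load the files instead.
images = [np.full((4, 4), i, dtype=np.uint8) for i in range(20)]
labels = np.array([1] * 10 + [0] * 10)  # assumed order: 10 humans, then 10 birds

rng = np.random.default_rng(42)         # fixed seed so the shuffle is repeatable
order = rng.permutation(len(images))
shuffled_images = [images[i] for i in order]
shuffled_labels = labels[order]
```

For the display, plt.subplots(2, 10, figsize=(18, 15)) from matplotlib returns a 2 x 10 array of axes that you can loop over with shuffled_images.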
Step 2 (10pts). Extract HoG features from all images. You can use the OpenCV function cv2.HOGDescriptor or the hog routine from scikit-image. Display the HoG features for all images in a 2 x 10 grid with figsize = (18, 15).
Step 3. Use the first 16 examples from the shuffled dataset as training data on which to train an SVM. The remaining 4 are used as test data. Reshape the HoG feature matrix as necessary to feed into the SVM. Train the classifier. DO NOT train with test data. No output is expected from this part.
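The reshape and split can be sketched as follows. The feature shape (7, 7, 36) is a made-up placeholder; your HoG output shape depends on the descriptor settings. The dtype conversions reflect what cv2.ml expects (float32 samples and int32 class labels); scikit-learn is less strict about dtypes.

```python
import numpy as np

# Placeholder HoG features: 20 images, each with a hypothetical (7, 7, 36) array.
rng = np.random.default_rng(0)
features = rng.random((20, 7, 7, 36))
labels = np.array([1, 0] * 10).astype(np.int32)  # cv2.ml expects int32 labels

# Flatten every image's features into one row: shape (20, 7 * 7 * 36).
X = features.reshape(len(features), -1).astype(np.float32)

n_train = 16
X_train, y_train = X[:n_train], labels[:n_train]
X_test, y_test = X[n_train:], labels[n_train:]
# With OpenCV: svm = cv2.ml.SVM_create(); svm.train(X_train, cv2.ml.ROW_SAMPLE, y_train)
```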
Step 4 (15pts). Perform predictions with your trained SVM on the test data. Output a vector of predictions, a vector of ground truth labels, and the prediction accuracy.
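If you use cv2.ml, svm.predict(X_test) returns a pair whose second element is a float32 array of shape (n, 1), so it needs flattening before it can be compared with the labels. The sketch below uses made-up values in place of real predictions to show the flattening and the accuracy computation.

```python
import numpy as np

# Stand-in for the second value returned by svm.predict(X_test): shape (n, 1), float32.
raw_predictions = np.array([[1.0], [0.0], [1.0], [1.0]], dtype=np.float32)
y_test = np.array([1, 0, 0, 1])  # hypothetical ground-truth labels

y_pred = raw_predictions.ravel().astype(int)
accuracy = float(np.mean(y_pred == y_test))  # fraction of correct predictions
print("predictions: ", y_pred)
print("ground truth:", y_test)
print("accuracy:    ", accuracy)
```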