Hint for implementation: Most image processing operations require float or double arrays. Map the result back to integers only at the very end of the processing, for display purposes. At that point you can also scale and shift the result to [0..255], using the minimum and maximum intensities computed across the resulting image.
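The final mapping to a displayable range can be sketched as follows. This is a minimal example, assuming NumPy; the function name to_uint8 is hypothetical and not part of the assignment.

```python
import numpy as np

def to_uint8(img):
    """Linearly map a float image to [0, 255] using its own min and max."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # Constant image: avoid dividing by zero and return all zeros.
        return np.zeros(img.shape, dtype=np.uint8)
    scaled = (img - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)

# Example: a float result with negative values, as derivative images have.
demo = to_uint8(np.array([[-1.0, 0.0], [1.0, 3.0]]))
```

The minimum of the input lands on 0 and the maximum on 255, so negative values (common in derivative images) remain visible after conversion.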
1) Convolution and Derivative Filters
Convolution: Implement 2D convolution (denoted *) without the use of built-in functions. Your function will take as input two 2D arrays, a filter f and an image I, and return f*I. You can handle the boundary of I in any reasonable way of your choice, such as padding with zeros based on the size of the filter f. You are free to make use of any part of your previous indexing implementation.
def convolution(f, I):
    # Handle boundary of I, e.g. pad I according to size of f
    # Compute im_conv = f*I
    return im_conv
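One possible way to fill in this skeleton is sketched below. It assumes zero padding, an output the same size as I, and an odd-sized filter; the name convolution_sketch is hypothetical. It loops over filter taps and accumulates shifted copies of the padded image, which is only one of several reasonable designs.

```python
import numpy as np

def convolution_sketch(f, I):
    """Zero-padded 2D convolution returning an array the same size as I."""
    f = np.asarray(f, dtype=np.float64)
    I = np.asarray(I, dtype=np.float64)
    fh, fw = f.shape
    # Flip the filter in both axes: this flip is what distinguishes
    # convolution from cross-correlation.
    f_flipped = f[::-1, ::-1]
    # Zero-pad so that every output pixel sees a full filter footprint.
    top, left = fh // 2, fw // 2
    bottom, right = fh - 1 - top, fw - 1 - left
    padded = np.pad(I, ((top, bottom), (left, right)), mode="constant")
    out = np.zeros_like(I)
    # Accumulate one shifted, weighted copy of the image per filter tap.
    for u in range(fh):
        for v in range(fw):
            out += f_flipped[u, v] * padded[u:u + I.shape[0], v:v + I.shape[1]]
    return out

# A single tap to the right of centre shifts the image one pixel to the right.
I = np.arange(12, dtype=np.float64).reshape(3, 4)
shift = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]])
shifted = convolution_sketch(shift, I)
```

Checking a shift filter like this is a quick way to confirm that the filter is being flipped: with cross-correlation the same filter would shift the image to the left instead.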
Derivative Filters: Using the example image cameraman.png:
- Denoise the image with a Gaussian filter. You may use the code from the in-class demo to create the Gaussian filter, but you must use your own implementation of convolution.
- Compute derivative images (with respect to x and with respect to y) using the separable derivative filter of your choice, e.g. [-1 0 1] and [-1 0 1]^T. You can hardcode the derivative filter, but use your implementation of convolution.
- Compute the gradient magnitude image.
- Create binary edge images from the gradient magnitude image using several thresholds.
In your report, introduce how the derivative filter is a discrete approximation of a derivative. Show all results and discuss the outcome of various thresholds.
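As background for the report: the central difference approximates a derivative as dI/dx ≈ (I(x+1) - I(x-1)) / 2, so the filter [-1 0 1] computes it up to the factor 1/2. The sketch below runs the steps above on a synthetic step-edge image rather than cameraman.png. The helper names, the kernel size, sigma, and the thresholds are all illustrative assumptions, and conv2d_same is only a compact stand-in for your own convolution.

```python
import numpy as np

def conv2d_same(f, I):
    """Compact zero-padded 2D convolution (stand-in for your own function)."""
    f = np.asarray(f, dtype=np.float64)[::-1, ::-1]  # flip for convolution
    I = np.asarray(I, dtype=np.float64)
    fh, fw = f.shape
    P = np.pad(I, ((fh // 2, fh - 1 - fh // 2), (fw // 2, fw - 1 - fw // 2)))
    out = np.zeros_like(I)
    for u in range(fh):
        for v in range(fw):
            out += f[u, v] * P[u:u + I.shape[0], v:v + I.shape[1]]
    return out

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2D Gaussian; size and sigma are illustrative choices."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def edge_maps(I, thresholds, size=5, sigma=1.0):
    """Return the gradient magnitude and one binary edge image per threshold."""
    smooth = conv2d_same(gaussian_kernel(size, sigma), I)
    dx = np.array([[-1.0, 0.0, 1.0]])  # derivative along x (columns)
    dy = dx.T                          # derivative along y (rows)
    # True convolution flips the filter, so Ix = I(x-1) - I(x+1) here.
    # The sign is reversed, but the gradient magnitude is unaffected.
    Ix = conv2d_same(dx, smooth)
    Iy = conv2d_same(dy, smooth)
    mag = np.sqrt(Ix ** 2 + Iy ** 2)
    edges = {t: (mag > t).astype(np.float64) for t in thresholds}
    return mag, edges

# Synthetic image: dark left half, bright right half (one vertical edge).
I = np.zeros((32, 32))
I[:, 16:] = 1.0
mag, edges = edge_maps(I, thresholds=[0.1, 0.3, 0.6])
```

With zero padding, the bright half also produces strong responses along the image border, because the padded zeros look like an edge. This is one effect worth discussing when comparing thresholds.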
2) Cross-correlation and Template Matching
The uploaded image multiplekeys.png is a single image containing multiple instances of keys.
Perform the following steps:
- Threshold the original key image so that the background is [0.0] and keys appear as [1.0]. You should have a binary image with only [0.0] (background) and [1.0] (keys).
- Choose your favorite key and crop with a narrow boundary. Use this image as your template.
- Modify your template by setting background pixels to [-1.0], so that you have [+1.0] for the key and [-1.0] for background. The reason for creating a signed template is improved matching performance.
- Implement cross-correlation with the binary input image and your new signed template image.
- Recall that cross-correlation produces a peak (maximum) where the template matches the image region from which you cropped it.
- Do a pass through the correlation image to detect the maximum peak value and its (x,y) location. You may mark this location by overlaying a circle, or by manually painting an arrow or similar. Discuss whether this location matches your expectation. Discuss the appearance and cause of other local peaks, and how they compare to the global peak.
In your report, show the original image, peak image (output of cross-correlation), and your template which was used for correlation.
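The steps above can be sketched on a small synthetic image. This is an illustration under stated assumptions: the shapes stand in for keys, the threshold 0.5 and the "keys brighter than background" polarity are guesses that may not hold for multiplekeys.png, and cross_correlation is a hypothetical helper name.

```python
import numpy as np

def cross_correlation(T, I):
    """Zero-padded cross-correlation of template T with image I (size of I)."""
    T = np.asarray(T, dtype=np.float64)
    I = np.asarray(I, dtype=np.float64)
    th, tw = T.shape
    P = np.pad(I, ((th // 2, th - 1 - th // 2), (tw // 2, tw - 1 - tw // 2)))
    out = np.zeros_like(I)
    # No flip here: that is the only difference from convolution.
    for u in range(th):
        for v in range(tw):
            out += T[u, v] * P[u:u + I.shape[0], v:v + I.shape[1]]
    return out

# Synthetic grey image: an L-shaped "key" and a solid block as a distractor.
gray = np.full((40, 40), 0.1)
gray[5:13, 5:7] = 0.9      # vertical bar of the L
gray[11:13, 5:11] = 0.9    # foot of the L
gray[25:33, 20:26] = 0.9   # solid block

# Threshold to a binary image: 0.0 background, 1.0 object.
binary = (gray > 0.5).astype(np.float64)

# Crop the L with a one-pixel border and make the template signed (+1 / -1).
crop = binary[4:14, 4:12]
template = 2.0 * crop - 1.0

corr = cross_correlation(template, binary)
peak_row, peak_col = (int(v) for v in
                      np.unravel_index(np.argmax(corr), corr.shape))
# The peak marks the template centre; as (x, y) it is (peak_col, peak_row).
```

In this example the solid block covers every +1 pixel of the template but also every -1 pixel inside the bounding box, so its score stays well below the true match. An unsigned (0/1) template would score the block as highly as the L, which illustrates why the signed template helps.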
3) Image Panoramas
An image panorama is created by stitching together several images. The images must be taken under certain criteria: the camera cannot undergo translation between pictures; only rotational transformations are allowed.
If you have ever tried to create a panorama by hand by cutting out pictures and taping them together, you know that it doesn't work very well. Translation and rotation are not sufficient to align the two images. In order for the panorama to look right, one of the images must also be transformed/warped.
A key observation is that images taken under the aforementioned criteria are equivalent under a perspective projection. We can take corresponding points between two images and create a linear system. In general, we choose many correspondences so that the system is overconstrained, and the transformation is found by linear least squares.
The following equations detail how to set up the matrix system:
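As a hedged illustration, one common formulation (an assumption here, which may differ from the handout's matrix in parameter ordering) uses eight parameters P = [a, b, c, d, e, f, g, h] with x' = (a x + b y + c) / (g x + h y + 1) and y' = (d x + e y + f) / (g x + h y + 1). Multiplying through by the denominator gives two linear equations per correspondence, which the sketch below stacks and solves. The function names and the test points are hypothetical.

```python
import numpy as np

def fit_projective(src_pts, dst_pts):
    """Least-squares fit of 8 projective parameters mapping src -> dst.

    src_pts, dst_pts: arrays of shape (N, 2) holding (x, y) pairs, N >= 4.
    """
    src = np.asarray(src_pts, dtype=np.float64)
    dst = np.asarray(dst_pts, dtype=np.float64)
    n = src.shape[0]
    A = np.zeros((2 * n, 8))
    rhs = np.zeros(2 * n)
    for k in range(n):
        x, y = src[k]
        xp, yp = dst[k]
        # Two rows per correspondence (assumed ordering a..h).
        A[2 * k] = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * k + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        rhs[2 * k], rhs[2 * k + 1] = xp, yp
    P, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return P

def apply_projective(P, pts):
    """Apply the 8-parameter projective map to an (N, 2) array of points."""
    a, b, c, d, e, f, g, h = P
    pts = np.asarray(pts, dtype=np.float64)
    x, y = pts[:, 0], pts[:, 1]
    w = g * x + h * y + 1.0
    return np.stack([(a * x + b * y + c) / w, (d * x + e * y + f) / w], axis=1)

# Synthetic check: generate exact correspondences from known parameters.
P_true = np.array([1.1, 0.05, 10.0, -0.03, 0.95, 5.0, 1e-4, 2e-4])
src = np.array([[0, 0], [100, 0], [100, 80], [0, 80], [60, 25], [20, 70]],
               dtype=np.float64)
dst = apply_projective(P_true, src)
P_est = fit_projective(src, dst)
```

With exactly four correspondences the system is square (8 equations, 8 unknowns); with more it is overconstrained and least squares averages out errors in hand-picked points.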
- Take two photos which overlap and differ only in rotation, e.g. similar to the example images above.
- Implement an image panorama for the case of 2 images (a source and a target). The algorithm is:
- Define corresponding points between the two images (these are pixel coordinates which you can find by hovering your mouse in a paint program, or found in any manner of your choice).
- Using the corresponding points, construct the known matrix and vector shown above. Note that this is similar to our affine example but with a different matrix structure.
- Solve for parameters P with linear least squares (e.g. use np.linalg.lstsq or similar).
- Using the found parameters P, transform the target image. You must implement the transformation yourself, but you may use any code from the demos. You can use any interpolation of your choice.
- Combine the source and transformed target together on the same canvas. You can handle the area of overlap in any way you choose (simplest is to always overwrite the grey values in the canvas image). Don't worry if your panorama has an obvious seam.
- Compare results using 4, 5, 6, and 7 corresponding points.
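The warping and combining steps can be sketched with inverse mapping and nearest-neighbour interpolation. Several assumptions are made here: images are 2D grey-value arrays, P = [a, b, c, d, e, f, g, h] maps canvas (x, y) coordinates to target coordinates (so no inversion is needed), the source sits at the canvas origin, and the denominator g x + h y + 1 stays positive over the canvas. The function names are hypothetical.

```python
import numpy as np

def warp_onto_canvas(target, P, canvas_shape):
    """Resample target onto a canvas by inverse mapping.

    Returns the warped image and a mask of canvas pixels that hit the target.
    """
    H, W = canvas_shape
    a, b, c, d, e, f, g, h = P
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    # For every canvas pixel, compute where it falls in the target image.
    w = g * xs + h * ys + 1.0  # assumed to stay positive
    xt = (a * xs + b * ys + c) / w
    yt = (d * xs + e * ys + f) / w
    # Nearest-neighbour interpolation: round to the closest target pixel.
    xi = np.rint(xt).astype(int)
    yi = np.rint(yt).astype(int)
    valid = ((xi >= 0) & (xi < target.shape[1]) &
             (yi >= 0) & (yi < target.shape[0]))
    warped = np.zeros((H, W), dtype=np.float64)
    warped[valid] = target[yi[valid], xi[valid]]
    return warped, valid

def composite(source, warped, valid):
    """Paste source at the canvas origin, then overwrite with warped pixels."""
    canvas = np.zeros_like(warped)
    canvas[:source.shape[0], :source.shape[1]] = source
    canvas[valid] = warped[valid]  # overlap: warped target always wins
    return canvas

# Toy example: the "warp" is a pure 30-pixel shift along x.
source = np.full((20, 40), 100.0)
target = np.tile(np.arange(40.0), (20, 1))  # grey value equals x coordinate
P_shift = [1.0, 0.0, -30.0, 0.0, 1.0, 0.0, 0.0, 0.0]
warped, valid = warp_onto_canvas(target, P_shift, (20, 70))
canvas = composite(source, warped, valid)
```

Inverse mapping visits every canvas pixel exactly once, which avoids the holes that appear when target pixels are pushed forward onto the canvas. Bilinear interpolation would be a natural refinement of the rounding step.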
Include all results and discussion in your report. Remember your report needs to be a standalone document that someone outside of class can follow.