1 VLFeat Installation
One of the key skills to learn in computer vision is the ability to use other open source code, which allows you to avoid re-inventing the wheel. We will use VLFeat by A. Vedaldi and B. Fulkerson (2008) to extract SIFT features from your images. Install VLFeat from the following:
http://www.vlfeat.org/install-matlab.html
Run vl_demo_sift_basic to double-check that the installation is complete.
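For reference, a minimal setup sketch in MATLAB (VLFEATROOT is a placeholder for your install directory):

% Add VLFeat to the MATLAB path (replace VLFEATROOT with your install location).
run('VLFEATROOT/toolbox/vl_setup');
% Verify the installation by running the bundled SIFT demo.
vl_demo_sift_basic;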
(NOTE) You will use this library only for SIFT feature extraction and its visualization. All subsequent visualizations and algorithms must be implemented with your own code.
2 SIFT Feature Extraction
Figure 1: Given your two cellphone images (left and right), you will extract SIFT descriptors and visualize them using VLFeat.
You will extract David Lowe's SIFT (Scale Invariant Feature Transform) features from your cellphone images as shown in Figure 1. First, take a pair of pictures with your calibrated camera (the intrinsic parameter, K, and the radial distortion parameter, k, are pre-calibrated) as follows:
- Take the first picture, then take another after moving one step (about 1 m) to the right.
- Common 3D objects at different depths, e.g., buildings, the ground plane, and trees, should appear in both images.
- The two images must be similar enough that appearance-based image matching can be applied, i.e., the SIFT features do not break, while maintaining a sufficient baseline between the cameras (at least 1 m), i.e., similar camera orientation with sufficient translation.
- Avoid a 3D scene dominated by a single planar surface, e.g., looking at the ground plane.
Write-up:
(SIFT visualization) Use VLFeat to visualize SIFT features with scale and orientation as shown in Figure 1. You may want to plot up to 500 feature points. You may follow this tutorial:
http://www.vlfeat.org/overview/sift.html
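As a starting point, a minimal sketch using VLFeat's vl_sift and vl_plotframe (the image filename is an assumption):

im = imread('left.jpg');                  % your first cellphone image
I = single(rgb2gray(im));                 % vl_sift expects a single-precision grayscale image
[f, d] = vl_sift(I);                      % f: 4 x K frames (x, y, scale, orientation); d: 128 x K descriptors
sel = randperm(size(f, 2), min(500, size(f, 2)));  % subsample up to 500 features
imshow(im); hold on;
h = vl_plotframe(f(:, sel));              % draw each feature as a circle with scale and orientation
set(h, 'color', 'y', 'linewidth', 2);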
3 SIFT Feature Matching
Figure 2: You will match points between I1 and I2 using SIFT features.
(NOTE) From this point, you cannot use any function provided by VLFeat.
A SIFT feature is composed of a scale, an orientation, and a 128-dimensional local descriptor (integer valued), $f \in \mathbb{Z}^{128}$. You will use the SIFT features to match between the two images, I1 and I2.
Write-up:
- (Nearest neighbor search) Let the two sets of features be $\{f_1,\dots,f_{N_1}\}$ from I1 and $\{g_1,\dots,g_{N_2}\}$ from I2, where $N_1$ and $N_2$ are the numbers of features in images 1 and 2, respectively. Compute the nearest neighbor of each feature and visualize the matches in both directions ($\{f_1,\dots,f_{N_1}\} \rightarrow \{g_1,\dots,g_{N_2}\}$ and $\{g_1,\dots,g_{N_2}\} \rightarrow \{f_1,\dots,f_{N_1}\}$) as shown in Figure 2(a) and Figure 2(b). Note that the distance between two features is defined as $d = \|f - g\|$. You may use the knnsearch function in MATLAB.
- (Ratio test) Filter out matches using the ratio test, i.e., keep a match if $d_{ij_1}/d_{ij_2} < 0.7$ and discard it otherwise, where $d_{ij_1}$ and $d_{ij_2}$ are the distances to the first and second nearest neighbors of the $i$th feature, respectively. Visualize the matches after the ratio test as shown in Figure 2(c) and Figure 2(d).
- (Bidirectional match) Visualize the bidirectionally consistent matches as shown in Figure 2(e). Compare the number of matches after each of the three steps above (see the sketch following this list).
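A minimal sketch of these three steps, assuming d1 and d2 are the 128 x N descriptor matrices extracted from I1 and I2:

D1 = double(d1');                             % N1 x 128, converted for knnsearch
D2 = double(d2');                             % N2 x 128
% Two nearest neighbors in I2 for each feature of I1.
[idx12, d12] = knnsearch(D2, D1, 'K', 2);     % idx12: N1 x 2 indices; d12: sorted distances
keep = d12(:, 1) ./ d12(:, 2) < 0.7;          % ratio test
m12 = [find(keep), idx12(keep, 1)];           % candidate matches I1 -> I2
% Repeat in the opposite direction, I2 -> I1.
[idx21, d21] = knnsearch(D1, D2, 'K', 2);
keep = d21(:, 1) ./ d21(:, 2) < 0.7;
m21 = [idx21(keep, 1), find(keep)];           % candidate matches I2 -> I1
% Keep only the matches that agree in both directions.
matches = intersect(m12, m21, 'rows');        % each row: [feature index in I1, feature index in I2]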
4 Fundamental Matrix
Figure 3: You will visualize epipole and epipolar lines for each image.
Compute a fundamental matrix between I1 and I2.
Write-up:
(1) (Fundamental matrix) Complete the following function to compute a fundamental matrix linearly:
F = ComputeFundamentalMatrix(u, v)
Input: u and v are $N_f \times 2$ matrices of 2D correspondences, where $N_f$ is the number of 2D correspondences, u ↔ v.
Output: $F \in \mathbb{R}^{3\times 3}$ is a rank-2 fundamental matrix.
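A minimal sketch of the linear (eight-point) estimate; in practice you may also want to normalize the coordinates (Hartley normalization) before building the system for numerical stability:

function F = ComputeFundamentalMatrix(u, v)
% u, v: Nf x 2 corresponding points; F: rank-2 fundamental matrix.
N = size(u, 1);
A = zeros(N, 9);
for i = 1:N
    x1 = u(i, 1); y1 = u(i, 2);  % point in image 1
    x2 = v(i, 1); y2 = v(i, 2);  % point in image 2
    % Each correspondence gives one linear constraint [x2 y2 1] * F * [x1 y1 1]' = 0.
    A(i, :) = [x2*x1, x2*y1, x2, y2*x1, y2*y1, y2, x1, y1, 1];
end
[~, ~, V] = svd(A);
F = reshape(V(:, end), 3, 3)';   % least-squares solution: the null vector of A
[U, S, V] = svd(F);
S(3, 3) = 0;                     % enforce the rank-2 constraint
F = U * S * V';
end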
(2) (Epipole and epipolar line) Pick 8 random correspondences, ur ↔ vr, compute the fundamental matrix, Fr, and visualize the epipole and epipolar lines for the rest of the feature points in both images as shown in Figure 3. Pick different sets of correspondences and visualize the different epipolar lines.
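For the visualization, a sketch of drawing the epipole and epipolar lines in image 2 (the variable names im2 and u are assumptions; nearly vertical lines, where the second line coefficient is close to zero, would need special handling):

% The epipole in image 2 is the left null vector of F, i.e., F' * e2 = 0.
[~, ~, V] = svd(F');
e2 = V(:, end); e2 = e2 / e2(3);   % epipole in inhomogeneous pixel coordinates
imshow(im2); hold on;
plot(e2(1), e2(2), 'r*');          % the epipole may lie outside the image boundary
for i = 1:size(u, 1)
    l = F * [u(i, :)'; 1];         % epipolar line l(1)*x + l(2)*y + l(3) = 0 in image 2
    x = [1, size(im2, 2)];
    y = (-l(3) - l(1) * x) / l(2); % intersect the line with the left/right image borders
    plot(x, y, 'g-');
end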
5 Robust Fundamental Matrix Estimation
Estimate the fundamental matrix using RANSAC.
Write-up:
(1) (RANSAC with fundamental matrix) Write a RANSAC algorithm for fundamental matrix estimation given N matches from Section 4, using the following pseudocode:
Algorithm 1 GetInliersRANSAC
1: n ← 0
2: for i = 1 : M do
3:   Choose 8 correspondences, ur and vr, at random from u and v.
4:   Fr = ComputeFundamentalMatrix(ur, vr)
5:   Compute the number of inliers, nr, with respect to Fr.
6:   if nr > n then
7:     n ← nr
8:     F = Fr
9:   end if
10: end for
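A minimal MATLAB sketch of this loop, assuming matched point matrices u and v (N x 2); the iteration count M and the inlier threshold are assumptions to be tuned:

function [inlierIdx, F] = GetInliersRANSAC(u, v)
% u, v: N x 2 matched points; returns the indices of the inliers and the best F.
M = 1000;                        % number of RANSAC iterations (assumed)
thresh = 0.01;                   % inlier threshold on the epipolar constraint (assumed)
N = size(u, 1);
uh = [u, ones(N, 1)]';           % 3 x N homogeneous points in image 1
vh = [v, ones(N, 1)]';           % 3 x N homogeneous points in image 2
n = 0; F = []; inlierIdx = [];
for i = 1:M
    r = randperm(N, 8);          % random minimal sample of 8 correspondences
    Fr = ComputeFundamentalMatrix(u(r, :), v(r, :));
    err = abs(sum(vh .* (Fr * uh), 1));   % algebraic error |v' * Fr * u| per match
    idx = find(err < thresh);
    if numel(idx) > n            % keep the model with the largest inlier set
        n = numel(idx);
        F = Fr;
        inlierIdx = idx;
    end
end
end

The algebraic epipolar error is the simplest choice here; the Sampson or point-to-epipolar-line distance is a common, more geometrically meaningful alternative.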
- (Epipole and epipolar line) Using the estimated fundamental matrix, visualize the epipole and epipolar lines.
Figure 4: Four configurations of camera pose from a fundamental matrix.
- (Camera pose estimation) Compute the 4 configurations of the relative camera pose:
[R1, C1, R2, C2, R3, C3, R4, C4] = CameraPose(F, K)
Input: F is the fundamental matrix and K is the intrinsic parameter.
Output: (R1, C1), ..., (R4, C4) are the rotations and camera centers (represented in the world coordinate system).
- Visualize the 4 configurations in 3D as shown in Figure 4.
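A sketch of one common way to realize CameraPose via the essential matrix $E = K^\top F K$ and its SVD (the sign conventions assume $P = KR[I_{3\times 3}\;\; -C]$):

function [R1, C1, R2, C2, R3, C3, R4, C4] = CameraPose(F, K)
% Four camera pose configurations from the essential matrix E = K' * F * K.
E = K' * F * K;
[U, ~, V] = svd(E);
W = [0 -1 0; 1 0 0; 0 0 1];
Rs = {U*W*V', U*W*V', U*W'*V', U*W'*V'};      % the two possible rotations
ts = {U(:, 3), -U(:, 3), U(:, 3), -U(:, 3)};  % paired with the two possible translations
Cs = cell(1, 4);
for i = 1:4
    Cs{i} = -Rs{i}' * ts{i};      % camera center in world coordinates, since t = -R*C
    if det(Rs{i}) < 0             % flip signs so that R is a proper rotation (det = +1)
        Rs{i} = -Rs{i}; Cs{i} = -Cs{i};
    end
end
R1 = Rs{1}; C1 = Cs{1}; R2 = Rs{2}; C2 = Cs{2};
R3 = Rs{3}; C3 = Cs{3}; R4 = Rs{4}; C4 = Cs{4};
end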
6 Triangulation
Given the four configurations of relative camera pose, you will find the best camera pose by verifying them through 3D point triangulation.
Write-up:
(1) (Linear triangulation) Write code that computes the 3D point given a correspondence, u ↔ v, and the two camera projection matrices:
[X] = LinearTriangulation(P1, u, P2, v)
Input: $P_1, P_2 \in \mathbb{R}^{3\times 4}$ are the two camera projection matrices, and $u \leftrightarrow v \in \mathbb{R}^2$ is their 2D correspondence.
Output: $X \in \mathbb{R}^3$ is the triangulated 3D point.
Hint: Use the triangulation method by linear solve, i.e., stack the cross-product constraints $[\tilde{u}]_\times P_1 X = 0$ and $[\tilde{v}]_\times P_2 X = 0$, where $\tilde{u}$ and $\tilde{v}$ are the homogeneous coordinates of u and v, and solve for the null space.
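A minimal sketch of this linear solve:

function X = LinearTriangulation(P1, u, P2, v)
% Triangulate one 3D point from the 2D correspondence u <-> v.
skew = @(p) [0, -p(3), p(2); p(3), 0, -p(1); -p(2), p(1), 0];  % cross-product matrix
A = [skew([u(:); 1]) * P1;
     skew([v(:); 1]) * P2];   % 6 x 4 homogeneous system A * X = 0
[~, ~, V] = svd(A);
X = V(:, end);
X = X(1:3) / X(4);            % de-homogenize to a 3D point
end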
- (Cheirality) Write code that computes the number of 3D points in front of the two cameras. The condition of a 3D point being in front of a camera is called cheirality:
idx = CheckCheirality(Y, C1, R1, C2, R2)
Input: Y is an $n \times 3$ matrix of n 3D points, and (C1, R1) and (C2, R2) are the first and second camera poses (camera center and rotation).
Output: idx is the set of indices of the 3D points that satisfy cheirality.
Hint: a point must satisfy $r_3(X - C) > 0$ for both cameras, where $r_3$ is the 3rd row of the rotation matrix (the z-axis of the camera), C is the camera center, and X is the 3D point.
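A minimal sketch of this check:

function idx = CheckCheirality(Y, C1, R1, C2, R2)
% Return the indices of the 3D points in Y (n x 3) that lie in front of both cameras.
z1 = (Y - C1(:)') * R1(3, :)';   % r3 * (X - C) for camera 1, evaluated for all points
z2 = (Y - C2(:)') * R2(3, :)';   % the same depth test for camera 2
idx = find(z1 > 0 & z2 > 0);     % keep points with positive depth in both views
end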
- (Camera pose disambiguation) Based on cheirality, find the correct camera pose.
Visualize the 3D camera pose and the 3D points together as shown in Figure 5. Hint: Use plot3 and DisplayCamera.m to visualize them.
Figure 5: You will visualize the four camera pose configurations with the point cloud.
- (Reprojection) Project the 3D points into each camera and visualize the reprojection, $\hat{u}$, and the measurement, $u$, on the image as shown in Figure 6, i.e., $\tilde{\hat{u}} \cong K R \left[ I_{3\times 3} \;\; -C \right] \tilde{X}$, where X is the reconstructed point (the tildes denote homogeneous coordinates).
Figure 6: Visualization of measurements and reprojection.
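A sketch of the reprojection for one camera, assuming X is an n x 3 matrix of reconstructed points and C is a 3 x 1 camera center (the variable names are placeholders):

Xh = [X, ones(size(X, 1), 1)]';        % 4 x n homogeneous 3D points
P = K * R * [eye(3), -C];              % projection matrix of this camera
x = P * Xh;                            % 3 x n projected homogeneous image points
uhat = (x(1:2, :) ./ x(3, :))';        % n x 2 reprojected pixel coordinates
imshow(im); hold on;
plot(u(:, 1), u(:, 2), 'go');          % measured feature points
plot(uhat(:, 1), uhat(:, 2), 'r+');    % reprojections of the triangulated points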