Intro to Image Understanding Assignment 4 Solution

Instructions for submission: Please write a document in a PDF or DOC file with your solutions (include images where needed), and submit it through MarkUs. Please include your code and specify the question to which it corresponds.







1. Stereo Matching Costs



[0.5 point] Imagine that the two images for stereo matching are captured by the same camera but under different exposure times. Compare the two main cost functions for matching, i.e. SSD (sum of squared differences) and NC (normalized correlation), for estimating the correspondences in these two images. Which one performs better and why?
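For intuition only (not required by the question), here is a small numerical sketch of how a pure exposure/gain change affects the two costs; the patch values below are made up.

```python
# Illustration: the same patch seen under a longer exposure (a gain change).
# Patch values are synthetic.
import numpy as np

rng = np.random.default_rng(0)
left_patch = rng.random((7, 7))      # hypothetical 7x7 patch from the left image
right_patch = 1.6 * left_patch       # corresponding patch, brighter due to longer exposure

def ssd(a, b):
    # Sum of squared differences: grows with any brightness/gain difference.
    return np.sum((a - b) ** 2)

def nc(a, b):
    # Zero-mean normalized correlation: invariant to gain and offset changes.
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return np.mean(a * b)

print("SSD:", ssd(left_patch, right_patch))  # large, even though the patches correspond
print("NC :", nc(left_patch, right_patch))   # ~1.0, correctly identifies the match
```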



2. Stereo Matching Implementation - for this question, use the rectified image pair (000020_left.jpg and 000020_right.jpg), the given bounding box in file 000020.txt, and the given parameters in file 000020_allcalib.txt.



(a) [2 points] Write a program to compute the depth for each pixel in the given bounding box of the car. Use the algorithm given in class: given a left patch, compare it with all the patches on the right image's scanline. To reduce the computational cost, you can use a small patch size, or sample patches (e.g. every other pixel) from the scanline instead of comparing with all possible patches. Report the patch size, sampling method, and matching cost function you used. Use the given parameters and show how depth is computed for each pixel. Also visualize the depth information. Are there any outliers coming from incorrect point correspondences?
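A minimal sketch of one possible implementation is below, assuming grayscale rectified inputs and an SSD cost; the bounding box coordinates, focal length f, and baseline b are placeholders that should be replaced with the values from 000020.txt and 000020_allcalib.txt.

```python
# Sketch of scanline matching with SSD (placeholder bounding box and calibration values).
import numpy as np
import cv2

left = cv2.imread("000020_left.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float32)
right = cv2.imread("000020_right.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float32)

f, b = 721.5, 0.54                   # placeholder focal length (px) and baseline (m)
half = 3                             # 7x7 patches
step = 2                             # sample every other candidate patch on the scanline
x0, y0, x1, y1 = 100, 100, 160, 140  # placeholder bounding box (use 000020.txt)

depth = np.zeros((y1 - y0, x1 - x0), np.float32)
for y in range(y0, y1):
    for x in range(x0, x1):
        patch = left[y - half:y + half + 1, x - half:x + half + 1]
        best_cost, best_d = np.inf, 0
        # For a rectified left/right pair the match lies to the left in the right image,
        # so search non-negative disparities d along the same scanline.
        for d in range(0, x - half, step):
            cand = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
            cost = np.sum((patch - cand) ** 2)          # SSD matching cost
            if cost < best_cost:
                best_cost, best_d = cost, d
        # Depth from disparity: Z = f * b / d (0 if no valid disparity was found).
        depth[y - y0, x - x0] = f * b / best_d if best_d > 0 else 0.0

# Simple visualization of the depth map inside the bounding box.
vis = cv2.normalize(depth, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite("depth_bbox.png", vis)
```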



(b) [2 points] After you compute depth using the scanline search (above), go to KITTI Stereo 2015 (http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=stereo), and pick a machine learning model from the scoreboard. You can pick a model that comes with code and a pre-trained model so you don't need to implement it yourself. Compute the depth for the whole image using their pretrained model. Compare your results from the previous question with the results from this model. What is the difference, both in terms of quality and speed?



(c) [1 point] Write a short summary of the workflow of the model you picked, e.g. what layers they use, what modules they use, etc.



(d) [1.5 points] Using the depth information from the models, try to determine which pixels within the bounding box belong to the car, based on their distance to the box center pixel's 3D location (use a threshold). After that, try to determine a 3D bounding box for the car (min & max along X, Y, Z). Visualize the segmentation of pixels within the 2D box, and on another image visualize the 3D box. State the classification threshold for distance.
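A minimal sketch of the back-projection and thresholding step is below, assuming a depth map for the box from part (a); the intrinsics, box corner, and threshold are placeholders to be taken from 000020_allcalib.txt, 000020.txt, and your own choice.

```python
# Sketch: back-project box pixels to 3D, keep those near the center pixel's 3D point,
# then take min/max along X, Y, Z as the 3D bounding box. All constants are placeholders.
import numpy as np

depth = np.full((40, 60), 10.0, np.float32)  # placeholder; use the depth map from part (a)
f, cx, cy = 721.5, 609.6, 172.9              # placeholder intrinsics (000020_allcalib.txt)
x0, y0 = 100, 100                            # placeholder top-left corner of the 2D box
thresh = 3.0                                 # placeholder classification threshold (metres)

h, w = depth.shape
us, vs = np.meshgrid(np.arange(x0, x0 + w), np.arange(y0, y0 + h))
X = (us - cx) * depth / f                    # pinhole back-projection: X = (u - cx) * Z / f
Y = (vs - cy) * depth / f                    # Y = (v - cy) * Z / f
pts = np.stack([X, Y, depth], axis=-1)       # per-pixel 3D points (X, Y, Z)

center = pts[h // 2, w // 2]                 # 3D location of the box-center pixel
mask = np.linalg.norm(pts - center, axis=-1) < thresh   # pixels classified as "car"

car_pts = pts[mask]
box_min, box_max = car_pts.min(axis=0), car_pts.max(axis=0)  # 3D box extents
print("3D box min (X, Y, Z):", box_min, "max:", box_max)
```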




3. Fundamental Matrix - for this question, take 3 images (I1, I2, I3) of an object or a stationary scene as follows: images I1 and I2 are from almost the same viewpoint, but with a (roll) rotation. That is, take an image (I1), don't move, but rotate the camera in place around 30-45 degrees, and take another image (I2). Take the third image (I3) from a different viewpoint, e.g. move the camera ~20 cm to the right and rotate (out of plane) to point the camera towards the object again.




I1 and I2:

[figure]
I1 and I2 (top view):

[figure]

(a) [1 point] Use SIFT matching (or any other point matching technique) to find a number of point correspondences in the (I1, I2) image pair and in the (I1, I3) image pair. Visualize the results. If there are any outliers, either manually remove them or increase the matching threshold so no outliers remain. Pick 8 point correspondences from the remaining set for each image pair, i.e. (I1, I2) and (I1, I3). Visualize those 8 point matches.



It helps in the later steps if the 8 point matches are somewhat distributed over the images rather than being clustered in a small region.
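A minimal sketch of this step using OpenCV's SIFT is below; the file names are placeholders, and the ratio-test threshold and the way the 8 matches are picked are just one reasonable choice (repeat the same procedure for the (I1, I3) pair).

```python
# Sketch of SIFT matching for one image pair (file names are placeholders).
import cv2
import numpy as np

img1 = cv2.imread("I1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("I2.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test to discard ambiguous matches (tighten 0.6 if outliers remain).
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2) if m.distance < 0.6 * n.distance]

# Pick 8 matches; stepping through the distance-sorted list is a crude way to avoid
# taking 8 near-identical best matches -- check visually that they are spread out.
good = sorted(good, key=lambda m: m.distance)
eight = good[:: max(1, len(good) // 8)][:8]

vis = cv2.drawMatches(img1, kp1, img2, kp2, eight, None)
cv2.imwrite("matches_I1_I2.jpg", vis)

pts1 = np.float32([kp1[m.queryIdx].pt for m in eight])   # 8x2 points in I1
pts2 = np.float32([kp2[m.trainIdx].pt for m in eight])   # 8x2 points in I2
```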




(b) [1 point] Using what we have learned in class (standard 8-point algorithm), calculate the fundamental matrix F12 for image pair (I1, I2) and the fundamental matrix F13 for image pair (I1, I3). (For this question, implement your own standard 8-point algorithm.)
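A minimal sketch of the standard 8-point algorithm is below; pts1 and pts2 stand for the 8x2 arrays of matched points from part (a) for one image pair (hypothetical names). If the raw pixel coordinates make the system poorly conditioned, Hartley-normalizing the points first helps; that step is omitted here for brevity.

```python
# Sketch of the standard 8-point algorithm (pts1, pts2: 8x2 point arrays from part (a)).
import numpy as np

def eight_point(pts1, pts2):
    # Each correspondence x=(x, y, 1) in I1 and x'=(x', y', 1) in I2 gives one row of A
    # in the homogeneous system A f = 0, derived from x'^T F x = 0.
    A = np.zeros((len(pts1), 9))
    for i, ((x, y), (xp, yp)) in enumerate(zip(pts1, pts2)):
        A[i] = [xp * x, xp * y, xp, yp * x, yp * y, yp, x, y, 1]
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)          # least-squares null vector of A
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0                          # enforce the rank-2 constraint (det F = 0)
    F = U @ np.diag(S) @ Vt
    return F / np.linalg.norm(F)      # F is only defined up to scale

F12 = eight_point(pts1, pts2)         # repeat with the (I1, I3) matches to get F13
```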

(c) [1 point] Using F12, calculate the epipolar lines in the right image for each of the 8 points in the left image and plot them on the right image. (For this question you can use any OpenCV functions you want.)
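A sketch using OpenCV is below, assuming F12 and the 8 points pts1 in I1 from the previous parts; the file name is a placeholder.

```python
# Sketch: epipolar lines in I2 for the 8 points of I1, using F12.
import cv2
import numpy as np

img2 = cv2.imread("I2.jpg")
h, w = img2.shape[:2]

# whichImage=1 maps points of the first image to epipolar lines in the second image.
lines = cv2.computeCorrespondEpilines(pts1.reshape(-1, 1, 2), 1, F12).reshape(-1, 3)

for a, b, c in lines:                      # each line satisfies a*x + b*y + c = 0
    if abs(b) < 1e-9:                      # skip (near-)vertical lines for simplicity
        continue
    p0 = (0, int(-c / b))                  # intersection with the left image border
    p1 = (w, int(-(c + a * w) / b))        # intersection with the right image border
    cv2.line(img2, p0, p1, (0, 255, 0), 1)
cv2.imwrite("epilines_I1_I2.jpg", img2)
```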









​​
(d) [1 point] Using F12, rectify I2 with I1 and visualize the resulting image side by side with I1. Do the same for I3 using F13. (For this question you can use any OpenCV functions you want.)
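A sketch using cv2.stereoRectifyUncalibrated is below, assuming F12 and the matched points pts1, pts2 from the earlier parts; the file names are placeholders, and the same calls apply to the (I1, I3) pair with F13.

```python
# Sketch of uncalibrated rectification of (I1, I2) with F12; repeat with F13 for (I1, I3).
import cv2
import numpy as np

img1 = cv2.imread("I1.jpg")                 # placeholder file names
img2 = cv2.imread("I2.jpg")
h, w = img1.shape[:2]

# H1, H2 are homographies that make the epipolar lines horizontal and aligned.
ok, H1, H2 = cv2.stereoRectifyUncalibrated(pts1, pts2, F12, (w, h))
rect1 = cv2.warpPerspective(img1, H1, (w, h))
rect2 = cv2.warpPerspective(img2, H2, (w, h))
cv2.imwrite("rectified_I1_I2.jpg", np.hstack([rect1, rect2]))   # side-by-side view
```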










(e) [0.5 point] Using OpenCV, compute F'12 and F'13 and compare with your results, i.e. are they the same? Are they similar? Briefly discuss.
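A sketch of the OpenCV estimate for comparison is below, assuming the same pts1, pts2 as before; since F is only defined up to scale, normalize both matrices before comparing entries.

```python
# Sketch: OpenCV's estimate F'12 from the same 8 correspondences, for comparison.
import cv2
import numpy as np

F12_cv, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_8POINT)

# Normalize both estimates (F is defined only up to scale) before comparing entries.
print(F12 / np.linalg.norm(F12))
print(F12_cv / np.linalg.norm(F12_cv))
```

Estimating F' from all inlier matches with cv2.FM_RANSAC, rather than only the 8 hand-picked points, typically gives a more stable result to compare against.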












​ ​
(f) [0.5 point] Using OpenCV, rectify the images using F'12 and F'13 and compare with your rectifications (part d). Discuss any differences.
