Lecture 4: Transforms (125)

Some of the perspective distortion evident in the image on the left (image a) can actually be corrected using a calibrated image sensor map. The way this works is that a camera lens + sensor pair is characterized by photographing a black-and-white grid of known size and measuring how the grid is distorted. The inverse pixel transform is then learned or analytically computed to undo the distortion (such as fisheye distortion) and quasi-normalize the captured image with minimal artifacts. This is used quite a bit in 3D printing, CNC machining, and in general any scientific method that uses a camera as a sensor. It is also used in computer vision to standardize inputs to deep learning models that are sensitive to out-of-distribution samples.
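
To make the "inverse pixel transform" idea concrete, here is a minimal sketch of lens undistortion. It assumes a one-coefficient radial model (distorted = undistorted * (1 + k1 * r^2)) on normalized image coordinates, which is a common simplification and not necessarily what any particular calibration tool uses. The names distort, undistort, and the value of k1 are made up for illustration; in practice k1 would come from the grid calibration.

```python
def distort(x, y, k1):
    """Apply one-coefficient radial distortion to normalized coordinates."""
    r2 = x * x + y * y
    s = 1.0 + k1 * r2
    return x * s, y * s


def undistort(xd, yd, k1, iterations=20):
    """Invert the model by fixed-point iteration.

    The model has no simple closed-form inverse, so we repeatedly
    re-estimate the undistorted radius and divide the scale back out.
    """
    xu, yu = xd, yd  # initial guess: the distorted point itself
    for _ in range(iterations):
        r2 = xu * xu + yu * yu
        s = 1.0 + k1 * r2
        xu, yu = xd / s, yd / s
    return xu, yu


# Hypothetical barrel-distortion coefficient (assumed, not measured).
K1 = -0.2
xd, yd = distort(0.3, 0.2, K1)
xu, yu = undistort(xd, yd, K1)
print((xd, yd), "->", (xu, yu))
```

The iteration converges quickly here because the distortion is mild; strongly distorted fisheye lenses usually need a different model.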


Yes, it's amazing what can be done computationally to compensate for an imperfect lens! However, the perspective distortion in image A appears because the camera's sensor plane was not parallel to the building fronts.
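
For a flat building front, this kind of perspective distortion can be modeled as a homography (a 3x3 projective transform), so it can be undone by mapping the four corners of the facade in the photo to a rectangle. The following is a rough sketch under that planar assumption. The corner coordinates are invented for illustration, and solve, homography, and apply_h are hypothetical helper names, not functions from any library.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def homography(src, dst):
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping 4 src points to 4 dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(H, x, y):
    """Map a pixel through H, including the projective divide."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Assumed facade corners in the photo (a trapezoid) and the target rectangle.
src = [(100, 50), (300, 100), (300, 300), (100, 350)]
dst = [(0, 0), (200, 0), (200, 300), (0, 300)]
H = homography(src, dst)
for p in src:
    print(p, "->", apply_h(H, *p))
```

Note that this only rectifies points on the facade plane; anything in front of or behind that plane will still look skewed after the warp.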


Because the camera is not pointing perpendicular to the buildings, lines in the xy plane converge to a vanishing point in image a. Furthermore, we see that in the z direction we have another vanishing point: the leftmost and rightmost buildings converge to a point somewhere in the middle. Lines that are parallel in 3D share a vanishing point in the image, so by looking at where the building edges converge, we can determine whether the buildings are parallel with one another.
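
As a small sketch of that check: in homogeneous coordinates, the line through two image points is their cross product, and the intersection of two lines is again a cross product. The edge coordinates below are made up for illustration, and line_through and intersect are hypothetical helper names.

```python
def cross(a, b):
    """3D cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through pixels p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def intersect(l1, l2):
    """Intersection of two homogeneous lines, or None if parallel in the image."""
    x, y, w = cross(l1, l2)
    if abs(w) < 1e-12:
        return None  # vanishing point is at infinity
    return (x / w, y / w)


# Assumed image edges of one building: roof line and ground line.
roof = line_through((100, 50), (300, 100))
ground = line_through((100, 350), (300, 300))
vp = intersect(roof, ground)
print("vanishing point:", vp)

# An edge from a second building: if it is parallel in 3D to the first two,
# it should pass (approximately) through the same vanishing point.
other = line_through((100, 200), (300, 200))
residual = other[0] * vp[0] + other[1] * vp[1] + other[2]
shares_vp = abs(residual) < 1e-6
print("second building parallel to the first:", shares_vp)
```

With real photos the edges are noisy, so one would compare the distance from the vanishing point to each line against a tolerance in pixels instead of an exact zero test.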

