Lecture 22: Image Processing (43)
aeave

Multiplication in frequency domain becomes faster when the size of the filter or image we are using gets large.

In spatial domain convolution (assuming our filter is of size N x N and is inseparable, and we have 3 color channels) we have to do 3 * N^2 float additions/multiplications at each of the HEIGHT x WIDTH pixel locations in the image.

In frequency domain multiplication, we only do HEIGHT x WIDTH float multiplications when we multiply the filter's transform with the image's transform (assuming that the image is larger than the filter and we have to pad the filter with zeros to ensure it is the same size as the image). We also have to transform to and from the frequency domain. To get a 2D DFT, we first apply the DFT to every row in the image and replace every row with its transform. Then we apply the DFT to every column in the replaced image. A 1024 x 1024 image would require 2048 1D 1024-dimensional transforms, which are each O(Nlog(N)) (N = 1024 in this case) if we use the FFT algorithm.

So if we have a 1024 x 1024 grayscale image and a 128 x 128 filter, we would do something like O(N^2 * M^2) work in the spatial domain, where N is the size of the filter (128), and M is the size of the image (1024). If we were to instead transform both the filter and the image to the frequency domain, we would do something like O(2*M * Mlog(M) + M^2), which equivalently is O(M^2 * log(M)), where M is the size of the image (assuming that it is larger in size than the filter) (also assuming we pad the filter with zeros so it is the same size as the image before we transform it to frequency domain). Remember that we have to perform the 2D DFT once for the image and once for the filter, perform M^2 multiplications between the DFT of the padded filter and the DFT of the image, and then O(M * log(M)) work to perform the inverse DFT once on the multiplied frequency transform using the inverse FFT algorithm. Again, this is equivalently O(M^2 * log(M)) after we ignore the asymptotic irrelevance of the element-wise multiplications we do in between the DFT and inverse DFT computation.

So we have O(M^2 * log(M)) for frequency domain multiplication and O(N^2 * M^2) for spatial domain convolution where N is the size of the filter and M is the size of the image. You can see that the spatial domain convolution grows faster if we increase the size of the filter.

tbh I'm not really sure if I got any of this right but yeah. TLDR: frequency domain multiplication becomes a better option the bigger you make the filter you are convolving with (I think).

henrykhaung

I think that makes sense. If you have a bigger kernel size, it would be better to do multiplication in the frequency domain and similarlly, if you have a smaller kernel size, it would be better to do it in spatial domain. I would also like to add that if you're doing more than filtering ie filter is one of the many steps in the processing pipeline, it would be better to do it in frequency domain as it is more efficient.

You must be enrolled in the course to comment