Lecture 24: High Performance Image Processing & Halide (36)
john-b-yang

This paper (http://web.cs.ucla.edu/~pouchet/doc/cc-article.11.pdf) discusses data layout transformations in the context of SIMD architectures. I found it very interesting, primarily because it goes deeper into how the pipeline is sped up by performing vector rather than scalar arithmetic. It also discusses how various reordering optimizations reduce the latency and wait time from one block to the next.
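
To make "data layout transformation" a bit more concrete, here is a minimal C++ sketch of one common example, converting an array-of-structures (AoS) image into a structure-of-arrays (SoA) one. This is my own illustration, not code from the paper, and it may not match the specific transformations the paper studies. The names `PixelAoS`, `ImageSoA`, `to_soa`, `scale_red_aos`, and `scale_red_soa` are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structures (AoS): the three channels of one pixel sit next to
// each other in memory. (Hypothetical type, for illustration only.)
struct PixelAoS {
    float r, g, b;
};

// Structure-of-arrays (SoA): each channel is stored contiguously.
// (Hypothetical type, for illustration only.)
struct ImageSoA {
    std::vector<float> r, g, b;
};

// The layout transformation itself: copy AoS data into SoA form.
ImageSoA to_soa(const std::vector<PixelAoS>& pixels) {
    ImageSoA out;
    out.r.reserve(pixels.size());
    out.g.reserve(pixels.size());
    out.b.reserve(pixels.size());
    for (const PixelAoS& p : pixels) {
        out.r.push_back(p.r);
        out.g.push_back(p.g);
        out.b.push_back(p.b);
    }
    return out;
}

// AoS version: consecutive red values are three floats apart, so a
// vector load would also pull in green and blue values it does not need.
void scale_red_aos(std::vector<PixelAoS>& pixels, float gain) {
    for (PixelAoS& p : pixels) {
        p.r *= gain;
    }
}

// SoA version: red values are unit-stride, which is the access pattern
// that contiguous SIMD loads and stores are designed for.
void scale_red_soa(ImageSoA& img, float gain) {
    // Sanity check: all channels describe the same number of pixels.
    assert(img.r.size() == img.g.size() && img.g.size() == img.b.size());
    for (std::size_t i = 0; i < img.r.size(); ++i) {
        img.r[i] *= gain;
    }
}
```

Both functions compute the same result. The difference is only in memory layout: with SoA, a compiler (or a hand-written intrinsic loop) can typically process several red values per instruction, while the AoS loop needs strided or gather-style accesses to do the same. Whether that actually pays off depends on the compiler, the target, and how the rest of the pipeline touches the data.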
