Modern processor architectures can execute a single instruction on multiple values at once. So-called SIMD (Single Instruction, Multiple Data) instructions work on packets (or vectors) of data instead of scalar values. They offer a significant performance boost for data-parallel algorithms that perform the same operations on large amounts of data, e.g. data encoding and decoding, image processing, or ray tracing. However, the performance gain comes at a price: programming languages provide no elegant means to exploit SIMD instruction sets. Packet operations have to be coded by hand, which is complicated, unintuitive, and error-prone. Thus, packetization (the transformation of scalar code to packet form) is mostly either applied automatically by local compiler optimizations (e.g. during loop vectorization) or carried out with considerable manual effort in performance-critical parts of a system.

This thesis describes an algorithm for automatic packetization that allows a programmer to write scalar functions but use them on packets of data. A compiler pass automatically transforms those functions to work on packets of the target architecture's SIMD width. The resulting packetized function computes the same results as multiple executions of the scalar code.
The algorithm is implemented on an intermediate representation that is independent of both source language and target architecture (the Low Level Virtual Machine, LLVM), which enables its use in many different environments.
The performance of the generated code is shown in a real-world case study in the context of real-time ray tracing: serial shader code written in C++ is automatically specialized, optimized, and packetized at runtime. The packetized shaders outperform their scalar counterparts by an average factor of 3.6 on a standard SSE architecture of SIMD width 4.
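
As a rough illustration of the equivalence stated above (a packetized function computes the same results as multiple executions of the scalar code), the following C++ sketch places a scalar function next to a hand-written packet version of width 4. It is not output of the thesis's compiler pass: the names `Packet`, `kWidth`, `scale_add`, `scale_add_packet`, and `packet_matches_scalar` are hypothetical, and the four lanes are emulated with a plain array rather than real SSE registers so that the example stays portable.

```cpp
#include <array>
#include <cstddef>

// Assumed SIMD width of 4, matching the SSE setting mentioned in the abstract.
constexpr std::size_t kWidth = 4;

// Hypothetical packet type: four float lanes emulated with a plain array.
using Packet = std::array<float, kWidth>;

// Scalar function as a programmer would write it (illustrative only).
float scale_add(float x, float a, float b) {
    return a * x + b;
}

// Hand-written stand-in for a packetized version: the same operation
// applied to every lane of the input packets.
Packet scale_add_packet(const Packet& x, const Packet& a, const Packet& b) {
    Packet result{};
    for (std::size_t lane = 0; lane < kWidth; ++lane) {
        result[lane] = a[lane] * x[lane] + b[lane];
    }
    return result;
}

// Checks the property from the abstract: one packet call yields the same
// values as kWidth separate calls of the scalar function.
bool packet_matches_scalar(const Packet& x, const Packet& a, const Packet& b) {
    const Packet packed = scale_add_packet(x, a, b);
    for (std::size_t lane = 0; lane < kWidth; ++lane) {
        if (packed[lane] != scale_add(x[lane], a[lane], b[lane])) {
            return false;
        }
    }
    return true;
}
```

On real hardware the per-lane loop would correspond to single SIMD instructions operating on a whole packet at once; the array-based form here only shows the intended semantics.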
Automatic Packetization, Ralf Karrenberg.
Master's Thesis, Universität des Saarlandes, July 2009.
@MASTERSTHESIS{Karrenberg:09:MSc,
  author = {Ralf Karrenberg},
  title = {{Automatic Packetization}},
  school = {Saarland University},
  year = {2009},
  month = {July},
  webpdf = {http://www.prog.uni-saarland.de/people/karrenberg/content/karrenberg_automatic_packetization.pdf}
}