As both programs and machines become more complex, writing high-performance code is an increasingly difficult task. To bridge the gap between compiled code and peak performance, resorting to domain-specific or architecture-specific libraries has become compulsory. However, the programmer must decide when and where to use a library function. This partition between library and user code is not questioned by the compiler, although it has a great impact on performance. In this paper we propose a new method that helps users find all code fragments in their application that can be replaced with library calls. The same technique can be used to change multiple calls, or fuse them, into more efficient ones. We present results of detecting BLAS 1 and 2 alternatives in SPEC.
"Deciding Where to Call Performance Libraries"
By C. Alias and D. Barthou
Proceedings of the International IEEE Euro-Par Conference, August, 2005
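
To make the idea of replacing a code fragment with a library call concrete, the following is a minimal sketch in C, not the authors' detection method. It only illustrates the kind of equivalence such a tool would have to recognize: a hand-written loop that computes the same result as a BLAS 1 axpy operation (y := alpha*x + y). The names `ref_axpy`, `user_fragment`, and `fragments_agree` are hypothetical; `ref_axpy` is a local stand-in for a real library routine, so the sketch compiles without linking against BLAS.

```c
#include <stddef.h>

/* Hypothetical stand-in for a BLAS 1 axpy routine: y := alpha * x + y.
   A real program would call the library's own implementation instead. */
static void ref_axpy(size_t n, double alpha, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}

/* User code as a programmer might write it: different variable names and
   a temporary, but the same computation as axpy. */
static void user_fragment(size_t n, double a, const double *u, double *v)
{
    for (size_t i = 0; i < n; ++i) {
        double t = a * u[i];
        v[i] = v[i] + t;
    }
}

/* Runs both versions on the same small input and returns 1 if every
   element matches, 0 otherwise. Agreement on one input is only evidence
   of equivalence, not a proof. */
static int fragments_agree(void)
{
    const double x[4] = {1.0, 2.0, 3.0, 4.0};
    double y_user[4] = {10.0, 20.0, 30.0, 40.0};
    double y_lib[4]  = {10.0, 20.0, 30.0, 40.0};

    user_fragment(4, 2.0, x, y_user);
    ref_axpy(4, 2.0, x, y_lib);

    for (size_t i = 0; i < 4; ++i)
        if (y_user[i] != y_lib[i])
            return 0;
    return 1;
}
```

Calling `fragments_agree()` from a test driver checks the two versions against each other on one input. Testing on sample data like this is a much weaker check than recognizing the equivalence statically, which is the problem the paper addresses.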