TotalProf: A Fast and Accurate Retargetable Source Code Profiler
Lei Gao, Jia Huang, Jianjiang Ceng, Rainer Leupers, Gerd Ascheid, and Heinrich Meyr
Abstract:
Profilers play an important role in software/hardware design, optimization, and verification. Various approaches have been proposed to implement profilers. The most widespread approach adopted in the embedded domain is Instruction Set Simulation (ISS) based profiling, which provides uncompromised accuracy but limited execution speed. Source code profilers, on the contrary, are fast but less accurate. This paper introduces TotalProf, a fast and accurate source code cross profiler that estimates the performance of an application from three aspects: First, code optimization and a novel virtual compiler backend are employed to resemble the course of target compilation. Second, an optimistic static scheduler is introduced to estimate the behavior of the target processor’s datapath. Last but not least, dynamic events, such as cache misses, bus contention and branch prediction failures, are simulated at runtime. With an abstract architecture description, the tool can be easily retargeted in a performance characteristics oriented way to estimate different processor architectures, including DSPs and VLIW machines. Multiple instances of TotalProf can be integrated with SystemC to support heterogeneous Multi-Processor System-on-Chip (MPSoC) profiling. With only about a 5 to 15% error rate introduced to the major performance metrics, such as cycle count, memory accesses and cache misses, a more than one Giga-Instruction-Per-Second (GIPS) execution speed is achieved.
Published:
"TotalProf: A Fast and Accurate Retargetable Source Code Profiler"
Lei Gao, Jia Huang, Jianjiang Ceng, Rainer Leupers, Gerd Ascheid, and Heinrich Meyr.
Proceedings of the 7th IEEE/ACM International Conference on Hardware/Software Codesign and System Synthesis, Grenoble, France, October 2009.