Cycle-approximate Retargetable Performance Estimation at the Transaction Level
Y. Hwang, S. Abdi, and D. Gajski
Abstract:
This paper presents a novel cycle-approximate performance estimation technique for automatically generated
transaction level models (TLMs) for heterogeneous multicore designs. The inputs are application C processes and
their mapping to processing units in the platform. The processing unit model consists of a pipelined datapath, a memory
hierarchy, and a branch delay model. Using the processing
unit model, the basic blocks in the C processes are analyzed
and annotated with estimated delays. This is followed by
a code generation phase where delay-annotated C code is
generated and linked with a SystemC wrapper consisting of
inter-process communication channels. The generated TLM
is compiled and executed natively on the host machine. Our
key contributions are that the estimation technique is close to
cycle-accurate, can be applied to any multi-core platform,
and produces high-speed, natively compiled TLMs. In our experiments, timed TLMs for industrial-scale designs such as
an MP3 decoder were automatically generated for 4 heterogeneous multi-processor platforms with up to 5 PEs in under
1 minute. Each TLM simulated in under 1 second, compared
to 3-4 hours of instruction set simulation (ISS) and 15-18 hours
of RTL simulation. Comparison to on-board measurement
showed only 8% error on average in the estimated number of
cycles.
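As a rough illustration of what delay-annotated C code might look like, the sketch below adds a cycle count at the start of each basic block of a small function. It is not the output of the authors' tool: the counter `cycles`, the function `sum_array`, and the per-block delay constants are all hypothetical, and the delay values are made up rather than derived from any processing unit model. In a generated TLM, the accumulated delay would presumably be handed to the SystemC wrapper at communication points; here it is simply kept in a global variable.

```c
#include <assert.h>

/* Hypothetical accumulated delay, in cycles, for one process. */
unsigned long long cycles = 0;

/* Hypothetical per-basic-block delays. These numbers are invented for
   illustration; a real estimator would compute them from the datapath,
   memory hierarchy, and branch delay model of the target unit. */
enum { BB_ENTRY = 3, BB_LOOP_BODY = 5, BB_EXIT = 2 };

/* Sums n integers; each basic block first adds its estimated delay. */
int sum_array(const int *a, int n)
{
    int s = 0;
    assert(n >= 0);
    cycles += BB_ENTRY;            /* entry block */
    for (int i = 0; i < n; i++) {
        cycles += BB_LOOP_BODY;    /* loop body block */
        s += a[i];
    }
    cycles += BB_EXIT;             /* exit block */
    return s;
}
```

With these assumed delays, a call on a 4-element array adds 3 + 4 * 5 + 2 = 25 cycles to the counter, while the functional result is computed natively at host speed.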
Published:
"Cycle-approximate Retargetable Performance Estimation at the Transaction Level"
Y. Hwang, S. Abdi, and D. Gajski
Proc. of Design Automation and Test in Europe (DATE'08), Munich, Germany, March 2008