Abstract:
Code layout and instruction prefetch are effective techniques for reducing instruction fetch delays in processors with an instruction cache. Code layout adjusts the relative positions of executable code in memory (its spatial relations), whereas instruction prefetch exploits the relative timing of code execution (its temporal relations). Few studies so far have combined the two to obtain better results. On-chip trace is a new debugging technique that uses dedicated hardware to record the complete program path and timestamps non-intrusively. Because it links the spatial and temporal relations of code execution, it can support the combination of code layout and instruction prefetch. On the YHFT-DSP platform, which provides an instruction cache and an on-chip trace system, instructions are prefetched according to program phase behavior extracted from the program execution path, and function code is reordered by a prefetch layout to achieve sufficient prefetch intervals. To reduce overhead, prefetch operations are executed by idle functional units of the VLIW DSP or in NOP instruction slots. Four benchmarks (a JPEG encoder, floating-point FFT, LPC, and an MPEG-4 encoder) are used to evaluate the method. The results show that the method improves prefetching performance and reduces instruction cache misses by exploiting the phase stability of the program path.
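
As a rough illustration of how a timestamped trace can tie prefetch timing to the program path, the sketch below picks, for each function, an earlier function at whose entry a prefetch could be issued so that the prefetch interval covers the cache miss latency. It is a minimal sketch under stated assumptions, not the method evaluated on YHFT-DSP: the trace format (a time-ordered list of `(cycle, function)` entries), the name `choose_prefetch_triggers`, and the `miss_latency` parameter are all hypothetical.

```python
from collections import Counter, defaultdict


def choose_prefetch_triggers(trace, miss_latency):
    """Pick a prefetch trigger for each function from a timestamped trace.

    trace        -- hypothetical format: time-ordered list of (cycle, function)
                    pairs, one pair per function entry
    miss_latency -- assumed instruction cache miss penalty, in cycles

    Returns {target: trigger}, where trigger is the function that was most
    often the latest one entered at least miss_latency cycles before target.
    """
    votes = defaultdict(Counter)
    for i, (t_target, target) in enumerate(trace):
        # Walk backwards to the most recent entry that is far enough in the
        # past for a prefetch issued there to complete before target runs.
        for j in range(i - 1, -1, -1):
            t_prev, prev = trace[j]
            if t_target - t_prev >= miss_latency and prev != target:
                votes[target][prev] += 1
                break
    # A stable program phase shows up as one trigger dominating the votes.
    return {t: counts.most_common(1)[0][0] for t, counts in votes.items()}


# Two iterations of a stable A -> B -> C phase (cycle counts are made up).
trace = [(0, "A"), (10, "B"), (50, "C"), (100, "A"), (110, "B"), (150, "C")]
triggers = choose_prefetch_triggers(trace, miss_latency=30)
```

With a larger assumed `miss_latency`, the chosen trigger moves further back along the path, which is where a layout step that lengthens the prefetch interval would come in.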