Abstract:
The rapid development of artificial intelligence applications, such as neural networks, image recognition, and text recognition, poses major challenges to traditional processors. Coarse-grained dataflow architectures have become a research hotspot for AI applications because they offer high instruction-level parallelism while remaining broadly applicable and adaptable. However, because the processing elements of coarse-grained dataflow architectures use random access memory (RAM) as their local storage, and neural networks are memory-intensive, many intra-instruction (inner-inst) memory access conflicts arise. An analysis of the memory access behavior of AI applications shows that these inner-inst conflicts greatly degrade the utilization of the computing units. Based on this observation, we propose a flexible data redundancy strategy (FRS) for dataflow processors that resolves inner-inst memory access conflicts by allocating multiple storage copies, at compile time, for the operand access requests that cause conflicts within an instruction. FRS effectively reduces the number of conflicts in the RAM. We evaluate FRS on typical AI benchmarks such as LeNet and AlexNet. The experimental results show that FRS improves power efficiency by 30.21% and 12.37% compared with the Round-Robin and Re-Hash non-data-redundancy strategies, respectively, and by 27.95% compared with a 2-copy multi-data redundancy strategy.
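To illustrate the compile-time idea summarized above, the following is a minimal conceptual sketch, not the paper's actual algorithm: operands whose accesses collide on the same RAM bank within one instruction receive an extra copy in another bank so the conflicting requests can be served in parallel. The bank count, the address-to-bank hash, and all function names here are illustrative assumptions.

```python
# Hypothetical sketch of compile-time conflict detection and operand
# replication; parameters and data structures are assumptions, not the
# FRS implementation described in the paper.
from collections import defaultdict

NUM_BANKS = 8  # assumed number of RAM banks per processing element


def bank_of(address: int) -> int:
    """Simple modulo mapping from an operand address to a bank (assumed)."""
    return address % NUM_BANKS


def plan_redundancy(instr_operand_addrs: list[list[int]]) -> dict[int, list[int]]:
    """For each instruction (a list of operand addresses), detect operands
    that collide on the same bank and place each extra copy in a bank not
    yet touched by that instruction.

    Returns a map: operand address -> list of banks holding a copy.
    """
    placement: dict[int, list[int]] = defaultdict(list)
    for addrs in instr_operand_addrs:
        used_banks = set()
        for addr in addrs:
            home = bank_of(addr)
            if home not in used_banks:
                # No inner-inst conflict: keep the single copy in its home bank.
                used_banks.add(home)
                if home not in placement[addr]:
                    placement[addr].append(home)
            else:
                # Conflict within this instruction: replicate the operand
                # into the first bank this instruction has not used yet.
                free = next(b for b in range(NUM_BANKS) if b not in used_banks)
                used_banks.add(free)
                placement[addr].append(free)
    return dict(placement)


if __name__ == "__main__":
    # Two instructions; in the second, addresses 9 and 17 both map to bank 1,
    # so one of them is given a redundant copy in a conflict-free bank.
    instrs = [[0, 1, 2], [9, 17, 4]]
    print(plan_redundancy(instrs))
```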