Abstract:
Nowadays, due to the rapid development of artificial intelligence, the memristor-based processing in memory (PIM) architecture for neural network (NN) attracts a lot of researchers’ interests since it performs much better than traditional von Neumann architecture. Equipped with the peripheral circuit to support function units, memristor arrays can process a forward propagation with higher parallelism and much less data movement than that in CPU and GPU. However, the hardware of the memristor-based PIM suffers from the large area overhead of peripheral circuit outside the memristor array and non-trivial under-utilization of function units. This paper proposes a 3D memristor array based PIM architecture for NNs (FMC) by gathering the peripheral circuit of function units into a function pool for sharing among memristor arrays that pile up on the pool. We also propose a data mapping scheme for the 3D memristor array based PIM architecture to further increase the utilization of function units and reduce the data transmission among different cubes. The software-hardware co-design for the 3D memristor array based PIM not only makes the most of function units but also shortens the wire interconnections for better high-performance and energy-efficient data transmission. Experiments show that when training a single neural network, our proposed FMC can achieve up to 43.33 times utilization of the function units and can achieve up to 58.51 times utilization of the function units when training multiple neural networks. At the same time, compared with the 2D-PIM which has the same amount of compute array and storage array, FMC only occupies 42.89% area of 2D-PIM. What’s more, FMC has 1.5 times speedup and 1.7 times energy saving compared with 2D-PIM.