Abstract:
Single particle reconstruction is one of the most important technologies for determining three-dimensional structures of macromolecules. In recent years, it has been given more and more attention, because of some of its distinct features. Unfortunately, its application is greatly constrained, due to its extremely long processing time and lack of efficient parallel implementations. This study optimizes and parallelizes one of the most widely-used software packages for single particle reconstruction: EMAN. By analyzing algorithms of its major components, the authors find that the key problem is achieving ideal load balancing with low communication costs. A self-adaptive dynamic scheduling algorithm is introduced to solve this problem. It is not only applicable to EMAN, but also to other similar scheduling problems with independent tasks. Actual experiments show that through optimization, serial execution time of our implementation is 11.50% less than that of EMAN. Besides, thanks to the self-adaptive scheduling algorithm, our implementation produces much higher speedups than EMAN. Speedups of the most time-consuming classification component are close to linearity. Moreover, parallel efficiency of our implementation on 16 CPU cores is 29.8% higher, compared with the implementation of EMAN. Therefore, our implementation is capable of making full use of available computing resources, dramatically reducing the processing time of single particle reconstruction.