Abstract:
High-performance password recovery is an important application scenario for the Sunway many-core processor. Many popular password recovery systems and tools adopt rule processing as their mainstream password generation method because it achieves a higher hit rate than dictionary-based or mask-based password generation. However, existing research has not optimized the rule processing algorithm for the Sunway processor, so rule-based password generation speed has become the bottleneck of password recovery systems on this platform. By analyzing the parallelism of the rule processing algorithm at different levels, we propose several optimization techniques for rule processing on the Sunway processor. For thread-level optimization, we explore the optimal scheme for parallelizing the rule processing algorithm, which comprises an optimal task mapping technique, an optimal local data memory allocation technique, a load balancing technique, and a variable-length rule storage technique. For data-level optimization, we analyze the computing patterns of the rule functions and use the Sunway SIMD instructions to vectorize them and reduce their execution time. Experimental results on the SW26010 processor show that the proposed optimization techniques effectively eliminate the performance bottleneck of rule processing and increase the rule-based password recovery speed by 30 to 101 times.