    Zhang Zhendong, Wang Tong, Liu Peng. Rule Processing Optimization Technologies on the Sunway Many-Core Processor[J]. Journal of Computer Research and Development, 2024, 61(1): 66-85. DOI: 10.7544/issn1000-1239.202220656

    Rule Processing Optimization Technologies on the Sunway Many-Core Processor

    Abstract: High-performance password recovery is one of the important application scenarios of the Sunway many-core processor. Mainstream password recovery tools widely adopt rule processing as a password generation method because of its relatively high hit rate compared with dictionary-based and mask-based generation. However, existing work lacks optimization of the rule processing algorithm, so the speed of rule-based password generation on the Sunway processor becomes the performance bottleneck of the password recovery system. By analyzing the multi-level parallelism of the rule processing algorithm, we propose thread-level and data-level optimization schemes for the Sunway many-core processor. At the thread level, we explore the optimal task mapping for the rule processing algorithm and design a task allocation mechanism between master and slave cores, a buffer ratio optimization mechanism for the slave cores' local data memory, a load balancing mechanism, and a variable-length rule storage mechanism to improve parallel efficiency. At the data level, we analyze the computing patterns of the rule functions and vectorize them with the Sunway SIMD instruction set to improve execution efficiency. Experimental results on the SW26010 processor show that these optimizations effectively eliminate the performance bottleneck of rule processing and speed up rule-based password recovery by 30 to 101 times.
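    To make the term "rule processing" concrete, the following is a minimal sequential sketch of rule-based password generation: each rule is a short sequence of rule functions applied to a dictionary word to produce one candidate. The function names `apply_rule` and `generate_candidates` are hypothetical, and the single-character rule codes (`l`, `u`, `c`, `r`, `d`, `$X`, `^X`, `:`) are an assumption borrowed from hashcat-style rule syntax; the abstract does not state which rule set the authors use, and this sketch does not reflect their Sunway-specific thread-level or SIMD optimizations.

```python
from typing import Iterable, List


def apply_rule(word: str, rule: str) -> str:
    """Apply a space-separated sequence of rule functions to one word.

    Rule codes are assumed (hashcat-style), not taken from the paper.
    """
    for op in rule.split():
        if op == ":":                           # no-op: keep the word unchanged
            continue
        elif op == "l":                         # lowercase all letters
            word = word.lower()
        elif op == "u":                         # uppercase all letters
            word = word.upper()
        elif op == "c":                         # capitalize first letter, lowercase the rest
            word = word.capitalize()
        elif op == "r":                         # reverse the word
            word = word[::-1]
        elif op == "d":                         # duplicate the word
            word = word + word
        elif len(op) == 2 and op[0] == "$":     # append one character
            word = word + op[1]
        elif len(op) == 2 and op[0] == "^":     # prepend one character
            word = op[1] + word
        else:
            raise ValueError(f"unsupported rule function: {op!r}")
    return word


def generate_candidates(words: Iterable[str], rules: Iterable[str]) -> List[str]:
    """Produce one candidate per (word, rule) pair, word-major order."""
    rules = list(rules)
    return [apply_rule(w, r) for w in words for r in rules]


if __name__ == "__main__":
    for candidate in generate_candidates(["password", "sunway"], [":", "c $1", "r u"]):
        print(candidate)
```

    Because every (word, rule) pair is independent, the candidate set can be partitioned across threads, and because each rule function touches every character of the word in a regular pattern, it is a natural target for vector instructions. These are the two levels of parallelism the abstract refers to.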

       
