高级检索

    基于并行字符索引的多步长正则表达式匹配算法

    Multi-Stride Regular Expression Matching Using Parallel Character Index

    • 摘要: 深度包检测(deep packet inspection, DPI)是网络入侵检测与防御系统(network intrusion dete-ction and prevention system, NIDPS)的核心.基于三态内容可寻址存储器(ternary content addressable memory, TCAM)的正则表达式匹配算法提高了数据包的处理速度,成为DPI技术的一个重要研究方向.TCAM具有查找速度快、存储空间小等特性,且能耗与存储空间成正比.由于DFA的存储空间开销比较大,且存储空间大小随着DFA步长数的增加而指数倍增,基于TCAM的DFA面临高能耗的问题,特别是多步长DFA.提出一种基于并行字符索引的多步长正则表达式匹配算法(multi-stride parallel character-indexed DFA, PCIDFA),对确定型有限自动机(deterministic finite automaton, DFA)构造并行字符索引,通过比特位图取交集,减少匹配时激活的TCAM块数,显著降低TCAM能耗.实验结果表明:与多步长DFA相比,多步长PCIDFA在TCAM能耗上减少了99.8%以上,在TCAM存储空间开销上减少了48.5%~65.3%,在吞吐量上提高了1.9~2.6倍.

       

      Abstract: Deep packet inspection (DPI) is a key function of network intrusion detection and prevention systems (NIDPS). TCAM-based regular expression matching algorithms have been proposed as a promising approach to improve processing speed, which is an important research direction of DPI. Ternary content addressable memory (TCAM) has the characters of high searching speed and small storage space, as well as the TCAM power consumption is proportionate to its storage space. Deterministic finite automaton (DFA) requires large storage space and the storage space of multi-stride DFA grows exponentially with the stride of DFA, which leads to high TCAM power consumption of DFA, especially for multi-stride DFA. This paper presents a parallel character-indexed multi-stride regular expression matching algorithm to address such limitation. This algorithm uses the idea of building parallel character indexes according to the stride of DFA, and reduces the number of activated TCAM blocks by using bitmap intersection, which in turn translates low TCAM power. Experimental results show that our algorithm reduces the TCAM power by more than 99.8% as well as the TCAM space usage by 48.5%~65.3%, and improves the matching throughput by 1.9~2.6 times compared with previous solutions based on multi-stride DFA.

       

    /

    返回文章
    返回