Abstract:
An automatic detection scheme for unknown malicious code samples is proposed based on space relevance features. Using quantitative feature vectors of the character space, each malicious code sample is divided into space relevance blocks by an intelligent region-growing segmentation algorithm. Within each block, the spatial relations of character moment, information entropy, and correlation coefficient are calculated, and the resulting feature vectors are extracted and normalized. A reference set of spatial relational feature vectors is then built by analyzing the general spatial properties of malicious code samples. To match previously unknown malicious code, a similarity-preferred matching algorithm based on comprehensive analysis of multiple features is adopted, in which the spatial relational distances of the individual features are weighted and combined to improve search accuracy. An experimental flow graph is designed, the spatial relational feature vectors of multiple malicious code sample blocks are plotted, and the detection accuracy of single-feature matching is compared with that of comprehensive multi-feature matching. Analysis of the experimental results shows that the proposed scheme can match previously unknown malicious code with high accuracy and can determine the type to which each malicious code sample belongs.
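The per-block feature extraction and weighted matching summarized above can be illustrated with a minimal sketch. It is not the scheme's actual implementation: it assumes the blocks have already been produced by some segmentation step, uses the byte mean as a stand-in for the character moment, uses the lag-1 correlation between neighbouring bytes as the correlation coefficient, and normalizes each feature to [0, 1]. The function names, the reference vectors, and the weights are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Sequence, Tuple


def block_features(block: bytes) -> Tuple[float, float, float]:
    """Return (mean, entropy, lag-1 correlation), each scaled to [0, 1]."""
    n = len(block)
    if n == 0:
        raise ValueError("block must be non-empty")

    # First-order moment of the byte values (assumed stand-in for "character moment").
    mean = sum(block) / n

    # Shannon entropy of the byte histogram, in bits (at most 8 for bytes).
    counts = Counter(block)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

    # Pearson correlation between each byte and its right-hand neighbour.
    corr = 0.0
    if n > 1:
        xs, ys = block[:-1], block[1:]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs)
        vy = sum((y - my) ** 2 for y in ys)
        if vx > 0 and vy > 0:
            corr = cov / math.sqrt(vx * vy)

    # Normalize: mean by 255, entropy by 8 bits, correlation from [-1, 1] to [0, 1].
    return (mean / 255.0, entropy / 8.0, (corr + 1.0) / 2.0)


def weighted_distance(a: Sequence[float], b: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def match_type(features: Sequence[float],
               references: Dict[str, Sequence[float]],
               weights: Sequence[float]) -> Tuple[str, float]:
    """Return the reference label with the smallest weighted distance."""
    label = min(references,
                key=lambda k: weighted_distance(features, references[k], weights))
    return label, weighted_distance(features, references[label], weights)
```

In this sketch, a single-feature match corresponds to setting all but one weight to zero, whereas the comprehensive multi-feature match uses non-zero weights for every feature.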