ISSN 1000-1239 CN 11-1777/TP

Table of Contents

01 October 2014, Volume 51 Issue 10
Stock Network Community Detection Method Based on Influence Calculating Model
Wang Hao, Li Guohuan, Yao Hongliang, Li Junzhao
2014, 51(10):  2137-2147.  doi:10.7544/issn1000-1239.2014.20130575
Abstract ( 1118 )   HTML ( 6)   PDF (3150KB) ( 4110 )  
Taking advantage of the energy characteristics of complex systems, the concept of influence is introduced into community detection research so that community structure can be discovered effectively. Working from stock closing prices, and using the definitions of influence and node centrality, a stock network is constructed in which influence serves as the edge weight. This paper proposes a stock network hierarchical clustering algorithm based on the influence calculating model, referred to as the BCNHC algorithm. Firstly, BCNHC introduces definitions of node activity and influence, and puts forward an influence calculating model for nodes in networks. Then, on the basis of the measure criterion of node centrality, the nodes with large centrality values are selected as center nodes, and the nodes' intimacy and the influence model are utilized to ensure the influence of association between neighbor nodes. Furthermore, nodes with minimum degree are gathered toward the center nodes, so as to reduce the clustering errors caused by uncertainty about which community a neighbor node belongs to. On this basis, neighboring communities are clustered according to their average influence of association, which guarantees that the influence of association is maximized for all nodes in a community, until the modularity of the entire network reaches its maximum. Finally, comparison and analysis of experiments on stock networks verify the feasibility of the BCNHC algorithm.
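The modularity-maximizing merge loop described above can be sketched in a few lines. This is an illustrative greedy merge over a small weighted graph, not the paper's BCNHC algorithm: the influence-based edge weights, node activity, and intimacy measures are replaced by plain edge weights.

```python
# Illustrative greedy modularity merging on a small weighted graph.
# Edge list format: (u, v, weight), each undirected edge listed once.

def modularity(edges, comm):
    """Weighted Newman modularity Q of the partition {node: label}."""
    m2 = 2.0 * sum(w for _, _, w in edges)          # 2m, twice total weight
    deg = {}
    for u, v, w in edges:
        deg[u] = deg.get(u, 0.0) + w
        deg[v] = deg.get(v, 0.0) + w
    inside, tot = {}, {}
    for u, v, w in edges:                           # 2 * intra-community weight
        if comm[u] == comm[v]:
            inside[comm[u]] = inside.get(comm[u], 0.0) + 2.0 * w
    for n, d in deg.items():                        # total degree per community
        tot[comm[n]] = tot.get(comm[n], 0.0) + d
    return sum(inside.get(c, 0.0) / m2 - (t / m2) ** 2 for c, t in tot.items())

def greedy_merge(edges, comm):
    """Repeatedly merge the community pair with the largest modularity gain."""
    comm = dict(comm)
    while True:
        base = modularity(edges, comm)
        labels = sorted(set(comm.values()))
        best_gain, best_pair = 0.0, None
        for i, a in enumerate(labels):
            for b in labels[i + 1:]:
                trial = {n: (a if c == b else c) for n, c in comm.items()}
                gain = modularity(edges, trial) - base
                if gain > best_gain:
                    best_gain, best_pair = gain, (a, b)
        if best_pair is None:                       # no positive gain left
            return comm
        a, b = best_pair
        comm = {n: (a if c == b else c) for n, c in comm.items()}
```

Starting from singleton communities on two triangles joined by one bridge edge, the loop merges each triangle into a community and then stops, since merging across the bridge would decrease Q.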
A Pattern Class Mining Model Based on Active Learning
Guo Husheng, Wang Wenjian
2014, 51(10):  2148-2159.  doi:10.7544/issn1000-1239.2014.20130572
Abstract ( 696 )   HTML ( 0)   PDF (5558KB) ( 823 )  
In practical applications, many data mining problems involve unknown class information, owing to the diversity, fuzziness and complexity of the objective world. Traditional methods generally assume that the categories of the data are known before mining, and there have been no effective methods for this kind of problem. To address it, this paper presents a pattern class mining model based on active learning, namely PM_AL. Firstly, using difference measurements between unlabeled and labeled samples, the most valuable samples are selected by an active learning technique. These valuable samples are then labeled by experts, so that the model can quickly mine the pattern classes implicit in the unlabeled samples. In this way, an unlabeled multi-class problem is transformed into a labeled multi-class problem at very low labeling cost. By applying active learning during initial class mining, the proposed PM_AL model achieves high learning efficiency, low labeling cost and good generalization performance. The experimental results demonstrate that PM_AL can effectively find as many categories as possible and solve large-scale multi-class classification problems with unknown categories.
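The sample-selection step can be illustrated with a toy difference measure. Here the "most valuable" unlabeled samples are simply those farthest from every labeled sample; Euclidean distance is an assumed measure, not necessarily the one used by PM_AL.

```python
# Toy "difference measure" selection for active learning: pick the unlabeled
# samples farthest from their nearest labeled sample.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def most_valuable(unlabeled, labeled, k=1):
    """Return the k unlabeled points with the largest distance to the labeled set."""
    def novelty(x):
        return min(euclidean(x, y) for y in labeled)
    return sorted(unlabeled, key=novelty, reverse=True)[:k]
```

Points far from everything already labeled are the most likely to belong to a class the model has not yet seen, which is why they are handed to the expert first.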
Online Counterfactual Regret Minimization in Repeated Imperfect Information Extensive Games
Hu Yujing, Gao Yang, An Bo
2014, 51(10):  2160-2170.  doi:10.7544/issn1000-1239.2014.20130823
Abstract ( 1249 )   HTML ( 2)   PDF (1915KB) ( 956 )  
In this paper, we consider the problem of exploiting suboptimal opponents in imperfect information extensive games. Most previous work uses opponent modeling and computes a best response to exploit the opponent. However, a potential drawback of such an approach is that the best response may not be a true one, since the modeled strategy may not be the strategy the opponent actually plays. We try to solve this problem from the perspective of online regret minimization, which avoids opponent modeling. We extend a state-of-the-art equilibrium-computing algorithm called counterfactual regret minimization (CFR). The core problem is how to compute the counterfactual values in online scenarios. We propose to learn approximations of these values from the outcomes produced by the game, and introduce two different estimators: a static estimator, which learns the values directly from the distribution of outcomes, and a dynamic estimator, which assigns larger weights to newly sampled outcomes than to older ones in order to better adapt to dynamic opponents. Two online regret minimization algorithms are proposed based on the two estimators. We also give the conditions under which the values estimated by our estimators equal the true values, showing the relationship between CFR and our algorithms. Experimental results in one-card poker show that our algorithms not only perform best when exploiting some weak opponents, but also outperform some state-of-the-art algorithms by achieving the highest win rate in matches of only a few hands.
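The two estimators can be caricatured as a running mean versus a recency-weighted mean. The classes below are illustrative stand-ins, not the paper's exact formulations; `alpha` is an assumed smoothing parameter.

```python
# Illustrative stand-ins for the two value estimators: a running average over
# all observed payoffs (static) and an exponentially weighted average that
# favors recent observations (dynamic).

class StaticEstimator:
    """Average of every sampled value seen so far."""
    def __init__(self):
        self.total, self.n = 0.0, 0
    def update(self, value):
        self.total += value
        self.n += 1
    def estimate(self):
        return self.total / self.n if self.n else 0.0

class DynamicEstimator:
    """Recency-weighted average; adapts faster when the opponent changes."""
    def __init__(self, alpha=0.3):
        self.alpha, self.value, self.seen = alpha, 0.0, False
    def update(self, value):
        if not self.seen:
            self.value, self.seen = value, True
        else:
            self.value = (1 - self.alpha) * self.value + self.alpha * value
    def estimate(self):
        return self.value
```

Against an opponent who switches strategy midway through a match, the dynamic estimate moves toward the new payoffs while the static one still averages over the stale history.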
Approximate Gaussian Kernel for Large-Scale SVM
Liu Yong, Jiang Shali, Liao Shizhong
2014, 51(10):  2171-2177.  doi:10.7544/issn1000-1239.2014.20130825
Abstract ( 941 )   HTML ( 0)   PDF (863KB) ( 681 )  
Training support vector machines (SVM) with nonlinear kernel functions on large-scale data is usually very time consuming. In contrast, there exist faster solvers for training linear SVM. To exploit the computational efficiency of linear SVM without sacrificing the accuracy of nonlinear ones, in this paper we present a method for solving large-scale nonlinear SVM based on an explicit description of an approximate Gaussian kernel. We first give the definition of the approximate Gaussian kernel, establish the connection between the approximate Gaussian kernel and the Gaussian kernel, and derive the error bound between the two kernel functions. Then, we present an explicit description of the reproducing kernel Hilbert space (RKHS) induced by the approximate Gaussian kernel. Thus, we can exactly depict the structure of the solutions of SVM, which enhances the interpretability of the model and deepens our understanding of the method. Finally, we explicitly construct the feature mapping induced by the approximate Gaussian kernel and use the mapped data as the input of linear SVM. In this way, existing efficient linear SVM solvers can be used to solve nonlinear SVM on large-scale data. Experimental results show that the proposed method is efficient and achieves classification accuracy comparable to that of the standard nonlinear SVM.
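As one concrete way to realize the idea of an explicit feature map whose inner product approximates a Gaussian kernel, the sketch below uses random Fourier features (Rahimi & Recht); the paper's own approximation is constructed differently, so treat this only as an illustration of the general approach of mapping first and then running a linear solver.

```python
import math
import random

# Random Fourier feature sketch of an explicit map z(.) with
# <z(x), z(y)> ~ exp(-gamma * ||x - y||^2).

def gaussian_kernel(x, y, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def make_rff(dim, n_features, gamma, rng):
    """Sample frequencies w ~ N(0, 2*gamma*I) and phases b ~ U[0, 2*pi)."""
    sigma = math.sqrt(2.0 * gamma)
    ws = [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    return ws, bs

def rff_map(x, ws, bs):
    d = len(ws)
    return [math.sqrt(2.0 / d) * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(ws, bs)]

def approx_kernel(x, y, ws, bs):
    fx, fy = rff_map(x, ws, bs), rff_map(y, ws, bs)
    return sum(a * b for a, b in zip(fx, fy))
```

With a few thousand features, the inner product of the mapped points tracks the exact kernel value closely, and the mapped data can be fed directly to a linear SVM solver.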
On Protein Complexes Identifying Algorithm Based on the Novel Modularity Function
Guo Maozu, Dai Qiguo, Xu Liqiu, Liu Xiaoyan
2014, 51(10):  2178-2186.  doi:10.7544/issn1000-1239.2014.20130538
Abstract ( 760 )   HTML ( 1)   PDF (1899KB) ( 568 )  
Proteins often interact with each other to form complexes that carry out their biological functions, and identifying these complexes is very significant for understanding the activities in the cell. In recent years, with the rapid development of new biological experiment technologies, a large number of protein-protein interaction (PPI) networks have been generated, and identifying protein complexes by clustering proteins in PPI networks has become a hot spot in bioinformatics research. In the last decade, many clustering methods, mainly based on graph partitioning or on community detection techniques from social networks, have been proposed to recognize protein complexes in PPI networks. However, the performance of most previously developed detection methods is not ideal: they cannot identify overlapping complexes, even though biological studies have found that protein complexes often overlap. Therefore, in this paper, a protein complexes modularity function (Q function), namely the PQ function, is proposed to identify overlapping complexes in PPI networks. Based on PQ, a new algorithm for identifying protein complexes, BMM (an algorithm based on the protein complexes modularity function for merging modules), is proposed. Firstly, BMM finds dense subgraphs as initial modules. Then, these initial modules are merged by maximizing the modularity function PQ. Finally, several high-quality protein complexes are obtained. Comparing these complexes with two known protein complex datasets, the results suggest that the performance of BMM is excellent. In addition, compared with other recent algorithms, BMM is more accurate.
Hybrid Differential Evolution Gravitation Search Algorithm Based on Threshold Statistical Learning
Zhang Yingjie, Gong Zhonghan
2014, 51(10):  2187-2194.  doi:10.7544/issn1000-1239.2014.20130395
Abstract ( 753 )   HTML ( 0)   PDF (1609KB) ( 646 )  
Differential evolution (DE) is a simple and efficient population-based stochastic real-parameter optimization algorithm that has been applied to a wide range of complex optimization problems. However, the standard DE has some drawbacks, such as premature convergence, low convergence precision and a slow convergence rate in the later stage of evolution. To deal with these drawbacks, a novel hybrid differential evolution algorithm (DEGSA-SL) is proposed, which incorporates the advantages of the gravitational search algorithm (GSA). In the proposed algorithm, a new operator called threshold statistical learning is designed. With this operator, the better of the DE and GSA strategies can be selected adaptively at each iteration, by learning from the previous success ratios of the two strategies, to produce the next generation in the evolution process. The algorithm makes full use of the potential of DE and GSA, ensures a balance between global exploration and local exploitation in the solution space, and improves the global search capability of the standard DE algorithm. Several complex benchmark functions are employed to test the performance of DEGSA-SL. The results show that the proposed algorithm not only achieves better convergence precision, robustness and convergence rate, but also avoids the premature convergence problem effectively.
An Effective Differential Privacy Transaction Data Publication Strategy
Ouyang Jia, Yin Jian, Liu Shaopeng, Liu Yubao
2014, 51(10):  2195-2205.  doi:10.7544/issn1000-1239.2014.20130824
Abstract ( 845 )   HTML ( 5)   PDF (2007KB) ( 842 )  
For the past few years, privacy preserving data publishing, which can securely publish data for analysis purposes, has attracted considerable research interest in the database community. However, the sparsity of transaction data complicates the trade-off between privacy protection and maintaining sufficient utility. Most existing data publishing methods for transaction data are based on partition-based anonymity models, such as k-anonymity; they depend on assumptions about the attacker's background knowledge, and the published data cannot meet the needs of analysis tasks. In contrast, differential privacy is a strong privacy model which provides guarantees independent of an adversary's background knowledge while also maintaining high utility for the published data. Because most existing methods and privacy models cannot accommodate both utility and privacy, in this paper an application-oriented TDPS (transaction data publishing strategy) is proposed, based on differential privacy and compressive sensing. Firstly, a complete Trie tree is constructed for the transaction database. Secondly, based on compressive sensing, we obtain a noisy Trie tree by adding differential privacy noise to the Trie tree. Finally, the frequent itemset mining task is performed on the noisy Trie tree. Theoretical analysis and experimental results demonstrate that TDPS preserves the privacy of sensitive data well while maintaining good data utility.
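The differential-privacy noise step can be illustrated with the standard Laplace mechanism on itemset counts; the Trie construction and compressive-sensing reduction of TDPS are omitted here, and the item names are hypothetical.

```python
import math
import random

# Laplace mechanism on itemset counts: each count of sensitivity 1 gets
# Laplace(1/epsilon) noise, the standard primitive behind differentially
# private count release.

def laplace_noise(scale, rng):
    u = rng.random() - 0.5                     # u in [-0.5, 0.5)
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def private_counts(counts, epsilon, sensitivity=1.0, rng=None):
    rng = rng or random.Random(0)
    scale = sensitivity / epsilon
    return {k: v + laplace_noise(scale, rng) for k, v in counts.items()}
```

Smaller epsilon means stronger privacy and larger noise; frequent itemset mining then runs on the noisy counts instead of the raw ones.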
An Algorithm for Top-k Query Refinement Based on User’s Feedback
Zhang Jianfeng, Han Weihong, Fan Hua, Zou Peng, Jia Yan
2014, 51(10):  2206-2215.  doi:10.7544/issn1000-1239.2014.20130827
Abstract ( 880 )   HTML ( 0)   PDF (1891KB) ( 642 )  
Top-k queries are mainly used to return the most important objects from a potentially huge answer space. After much effort devoted to top-k query performance, the explain capability has attracted more attention in recent years. In a top-k query, since users cannot precisely specify their own preferences, they may feel frustrated with the top-k results and pose a question such as "why does my expected tuple m not appear in the top-k results while tuple p does?" To solve this problem, a novel method is proposed to answer the user's why-not question by modifying the original top-k query. Firstly, an assessment model is defined to evaluate the difference between the original top-k query and the modified new query. Then, based on the assessment model, a sampling method is used to draw candidate weight vectors from a restricted subspace of the weight space. After that, for each candidate weight vector, a progressive top-k algorithm is used to obtain the optimal new query, and the termination conditions of the progressive top-k algorithm are improved by the assessment model. Furthermore, the new query with the best evaluated score is returned as the solution. Finally, an extensive performance study using a synthetic data set verifies the efficiency of the proposed algorithm.
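The underlying top-k scoring model — a weighted sum of attributes — is easy to state concretely. In the toy demo below (hypothetical tuples and weights), a "why-not" tuple m that is absent under one weight vector appears once the weights are modified, which is exactly the kind of refinement the algorithm searches for.

```python
# Linear top-k scoring: each tuple's score is a weighted sum of its
# attributes; the query returns the k highest-scoring tuples.

def topk(tuples, weights, k):
    score = lambda t: sum(w * a for w, a in zip(weights, t))
    return sorted(tuples, key=score, reverse=True)[:k]
```

Query refinement then amounts to searching the weight space for a vector close to the original one under which the missing tuple enters the top-k.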
Exploration of Weighted Proximity Measure in Information Retrieval
Xue Yuanhai, Yu Xiaoming, Liu Yue, Guan Feng, Cheng Xueqi
2014, 51(10):  2216-2224.  doi:10.7544/issn1000-1239.2014.20130339
Abstract ( 946 )   HTML ( 0)   PDF (4508KB) ( 885 )  
A key problem in information retrieval is to provide information seekers with relevant, accurate and even complete information. Many traditional information retrieval models are based on the bag-of-words assumption, without considering the implied associations among query terms. Although term proximity has been widely used to boost the performance of classical information retrieval models, most of those efforts do not fully consider the differing importance of the query terms. For queries in modern information retrieval, the query terms are not only dependent on each other but also different in importance. Thus, computing term proximity while taking the differing importance of terms into account is helpful for improving retrieval performance. To achieve this, a weighted term proximity measure is introduced, which distinguishes the significance of query terms based on the collections to be searched. A weighted proximity BM25 model (WP-BM25), integrating this measure into the Okapi BM25 model, is proposed to rank the retrieved documents. A large number of experiments are conducted on three standard TREC collections: FR88-89, WT2G and WT10G. The results show that the weighted proximity BM25 model can significantly improve retrieval performance, and that it has good robustness.
TTRank: User Influence Rank Based on Tendency Transformation
Duan Songqing, Wu Bin, Wang Bai
2014, 51(10):  2225-2238.  doi:10.7544/issn1000-1239.2014.20131570
Abstract ( 721 )   HTML ( 0)   PDF (4441KB) ( 680 )  
In recent years, many scholars have studied how to analyze user influence from the perspective of reply relationships. However, existing approaches have various problems: reply relations are scarce, the content of posts is ignored, dynamic updates are not supported, and so on. To compensate for these deficiencies, a user influence analysis method based on tendency transformation is proposed. Firstly, post influence is calculated, the concept of the "local reply chain" (LRC) is proposed, and a calculation method for indirect reply relationships is introduced, which drastically increases the number of reply relationships. Secondly, the process of user tendency transformation is analyzed based on the LRC, yielding the degree to which a user impacts others and the degree to which a user is affected by others. Finally, the user influence ranking within a specified range is obtained. Compared with 10 classical influence analysis algorithms and applied to real data, the proposed algorithm characterizes users better from this alternative point of view.
Indexing Page Collection Selection Method for Search Engine
Ru Liyun, Li Zhichao, Ma Shaoping
2014, 51(10):  2239-2247.  doi:10.7544/issn1000-1239.2014.20130340
Abstract ( 703 )   HTML ( 0)   PDF (1980KB) ( 758 )  
With the rapid development of the Internet, the number of Web pages is growing explosively, which presents a huge challenge for search engines that provide Web page search services. There are also many pages with similar or even identical content, as well as low-quality pages. For a search engine, indexing such pages has no significant effect on retrieval results but increases the indexing and retrieval burden. A page selection algorithm is proposed to build an indexing page collection from massive Web data for a search engine. On the one hand, a signature-based clustering algorithm is used to filter similar pages in order to compress the size of the indexing page collection; on the other hand, a variety of page-dimension and user-dimension features are combined to ensure the quality of the collection. The algorithm is not only able to cluster and select pages quickly, but also achieves a high compression ratio while preserving the amount of information present in the indexing page collection. Experiments with actual page collections show that the indexing page collection selected by the proposed algorithm is about 1/3 the size of the entire page collection, and can satisfy the vast majority of user click needs, demonstrating strong practicality.
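A signature-based duplicate filter of the kind used in the clustering step can be sketched with a simhash-style fingerprint — a common near-duplicate technique, not necessarily the authors' exact signature. Pages whose fingerprints differ in few bits are treated as near-duplicates.

```python
import hashlib

# Simhash-style 64-bit page fingerprint over word features; fingerprints of
# near-duplicate pages differ in few bits (Hamming distance).

def simhash(text, bits=64):
    v = [0] * bits
    for word in text.split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16) % (1 << bits)
        for i in range(bits):
            v[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if v[i] > 0)

def hamming(a, b):
    return bin(a ^ b).count("1")
```

Because the fingerprint is built from a bag of word hashes, reordering the words of a page leaves the signature unchanged, and a small Hamming-distance threshold clusters near-duplicate pages together.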
Markov Network Retrieval Model Based on Document Cliques
Tang Wanning, Wang Mingwen, Wan Jianyi
2014, 51(10):  2248-2254.  doi:10.7544/issn1000-1239.2014.20130343
Abstract ( 733 )   HTML ( 0)   PDF (1345KB) ( 597 )  
Query expansion is an effective way to improve the efficiency of information retrieval. However, many query expansion methods select expansion terms without taking full account of the correlations between terms, as well as between terms and documents, which may reduce retrieval performance. Since information about the correlation between terms and documents can improve retrieval efficiency, this paper calculates the correlation between documents and terms and maps terms to documents to build a Markov network retrieval model; it then extracts term cliques according to the mapping information. The mapping information is used to divide the term cliques into two categories: those based on documents and those not based on documents. The term cliques based on documents are more relevant to the query topic, so they are given greater weight, and the information from both kinds of term cliques is used to assist retrieval. Therefore, the method proposed in this paper makes the expansion content more relevant to the query. Experimental results show that the proposed model can improve retrieval efficiency.
A Collaborative Filtering Recommendation Mechanism for Cloud Computing
Zhu Xia, Song Aibo, Dong Fang, Luo Junzhou
2014, 51(10):  2255-2269.  doi:10.7544/issn1000-1239.2014.20130056
Abstract ( 883 )   HTML ( 3)   PDF (5531KB) ( 1046 )  
The advent of cloud computing has brought a sharp increase in the amount of application data, which gives more and more prominence to the personalized recommendation techniques used by various kinds of Web applications. However, traditional collaborative filtering (CF) techniques, when applied to cloud computing with its huge scale and distributed processing architecture, are confronted with problems such as low recommendation accuracy, long recommendation time and high network traffic, so the performance of recommendation mechanisms degrades drastically. To address this issue, a CF recommendation mechanism for cloud computing (RAC) is proposed in this paper. RAC first employs candidate neighbors (CN), whose definition is also given, to screen the items which exert great influence on the recommendation results. Then a two-stage rating indexing structure based on a distributed storage system is constructed to ensure that RAC can locate CNs quickly and precisely. On this basis, a CN-based CF recommendation algorithm, CN-DCFA, which searches the k nearest neighbors among the candidate neighbors, is provided. Meanwhile, CN-DCFA utilizes similarity to predict the top-N recommendation results for active users. Experiments show that our recommendation mechanism RAC is both efficient and accurate.
Mining Latent Semantic on User-Tag-Item for Personalized Music Recommendation
Li Ruimin, Lin Hongfei, Yan Jun
2014, 51(10):  2270-2276.  doi:10.7544/issn1000-1239.2014.20130342
Abstract ( 1182 )   HTML ( 13)   PDF (1201KB) ( 1367 )  
Personalized recommender systems are confronting great challenges in accuracy, diversification and novelty, especially when the data set is sparse and lacks auxiliary information such as user profiles, item attributes and explicit ratings. Collaborative tags contain abundant information about personalized preferences and item contents, and therefore have the potential to help provide better recommendations. In this paper, we analyze information from the famous music social network Last.fm. A bipartite graph is established among users, items and tags, and random walk with restart is used to analyze the relationships between these nodes and obtain the neighboring relations between songs or tags. From this, a recommended music list and an indirectly related music collection can be obtained. Finally, a personalized music recommendation algorithm is implemented by fusing and re-ranking the recommendation lists using the algorithm proposed in this paper. Experiments show that, on the same corpus, the proposed music recommendation algorithm performs better than common methods such as collaborative filtering and bipartite-graph-based algorithms. Our method, built on Last.fm, therefore satisfies users' personalized requirements for music. Furthermore, with the development of Web 2.0, our method will show its advantage as the number of tags grows ever larger.
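The random-walk-with-restart step can be sketched on a toy adjacency structure; proximity to the start node decays with graph distance, which is what makes the scores usable for ranking related songs or tags. The graph, node names and restart probability below are illustrative.

```python
# Power-iteration form of random walk with restart: at every step the walker
# either follows a random edge (prob 1 - restart) or jumps back to the start
# node (prob restart). The stationary scores rank nodes by proximity.

def rwr(adj, start, restart=0.15, iters=100):
    p = {n: 0.0 for n in adj}
    p[start] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for n, mass in p.items():
            for nbr in adj[n]:
                nxt[nbr] += (1.0 - restart) * mass / len(adj[n])
        nxt[start] += restart
        p = nxt
    return p
```

Running the walk from a user node over the user-item-tag graph yields one proximity score per song or tag, and the top-scoring unseen items form the recommendation list.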
MMCKDE: m-Mixed Clustering Kernel Density Estimation over Data Streams
Xu Min, Deng Zhaohong, Wang Shitong, Shi Yingzhong
2014, 51(10):  2277-2294.  doi:10.7544/issn1000-1239.2014.20130718
Abstract ( 808 )   HTML ( 0)   PDF (5450KB) ( 660 )  
In many data stream mining applications, traditional density estimation methods such as kernel density estimation and reduced set density estimation cannot be applied to data stream density estimation because of their high computational burden and large storage requirements. In order to reduce the time and space complexity, a novel online data stream density estimation method based on m-mixed clustering kernels is proposed. In the proposed method, MMCKDE nodes are created using a fixed number of mixed clustering kernels to capture cluster information, instead of keeping all the kernels that other density estimation methods would produce. In order to further reduce storage space, MMCKDE nodes can be merged by calculating the KL divergence. Finally, the probability density function over an arbitrary time period, or over the entire time span, can be estimated from the resulting model. We compare the MMCKDE algorithm with the SOMKE algorithm in terms of density estimation accuracy and running time on various stationary data sets, and also investigate the use of MMCKDE over evolving data streams. The experimental results illustrate the effectiveness and efficiency of the proposed method.
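The fixed-budget kernel merging idea can be sketched as follows; for brevity the merge criterion here is the distance between kernel means rather than the paper's KL divergence, and weights are merged by simple moment matching.

```python
import math

# Fixed-budget stream KDE: keep at most `budget` Gaussian kernels; when the
# budget is exceeded, merge the two kernels with the closest means.

class BudgetKDE:
    def __init__(self, budget, bandwidth):
        self.kernels = []          # list of (weight, mean)
        self.budget = budget
        self.h = bandwidth

    def add(self, x):
        self.kernels.append((1.0, x))
        if len(self.kernels) > self.budget:
            pairs = ((i, j) for i in range(len(self.kernels))
                     for j in range(i + 1, len(self.kernels)))
            i, j = min(pairs, key=lambda ij: abs(self.kernels[ij[0]][1]
                                                 - self.kernels[ij[1]][1]))
            (wi, mi), (wj, mj) = self.kernels[i], self.kernels[j]
            merged = (wi + wj, (wi * mi + wj * mj) / (wi + wj))
            self.kernels = [k for n, k in enumerate(self.kernels)
                            if n not in (i, j)] + [merged]

    def density(self, x):
        total = sum(w for w, _ in self.kernels)
        return sum(w * math.exp(-((x - m) ** 2) / (2.0 * self.h ** 2))
                   for w, m in self.kernels) / (total * self.h * math.sqrt(2.0 * math.pi))
```

Even after aggressive merging, the compact model still places high density near the stream's modes and low density in between, which is the property the paper's accuracy experiments measure.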
Structured Sparse Linear Discriminant Analysis
Cui Zhen, Shan Shiguang, Chen Xilin
2014, 51(10):  2295-2301.  doi:10.7544/issn1000-1239.2014.20130188
Abstract ( 1231 )   HTML ( 3)   PDF (1803KB) ( 802 )  
Linear discriminant analysis (LDA) is a very efficient image feature extraction technique in the supervised scenario. However, LDA often leads to over-fitting when trained on small-scale samples, and the learnt projections may not admit an intuitive explanation from the viewpoint of human cognition. To handle these problems, especially the discovery of interpretable structures, a method called structured sparse LDA (SSLDA) is proposed by employing the linear regression formulation of LDA together with the structured sparse L_{2,1} mixed norm. Furthermore, to remove correlations among the learnt linear transforms, orthogonalized SSLDA (OSSLDA) is also proposed to learn more subtle textural structure information from face images. To solve the two proposed models, SSLDA and OSSLDA, we introduce a simple and efficient half-quadratic optimization algorithm, which incorporates an auxiliary variable into the objective function and then alternately optimizes the projection variable and the auxiliary variable. To evaluate the proposed methods, we conduct extensive experiments on three public face datasets, AR, Extended Yale B and MultiPIE, for the face recognition task, comparing against LDA and several of its classical variants. The experimental results show the benefits of the proposed methods in both classification accuracy and interpretability.
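The structured-sparsity regularizer at the heart of SSLDA is the L_{2,1} mixed norm: the sum of the Euclidean norms of the matrix rows, which favors projections whose energy concentrates in a few whole rows.

```python
import math

# L_{2,1} mixed norm: sum of the Euclidean (L2) norms of the matrix rows.
# Penalizing it drives entire rows of the projection matrix to zero,
# which is what makes the learnt projections row-sparse.

def l21_norm(W):
    return sum(math.sqrt(sum(x * x for x in row)) for row in W)
```

Comparing two matrices with the same Frobenius norm shows the effect: the one with its energy in a single row has the smaller L_{2,1} norm, so minimizing the norm pushes solutions toward row-sparsity.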
Dynamic Spatial Synthesis of Dorsal Hand Vein Images Based on PCA
Wang Yiding, Jiang Nan, Li Kefeng
2014, 51(10):  2302-2307.  doi:10.7544/issn1000-1239.2014.20130822
Abstract ( 610 )   HTML ( 1)   PDF (1906KB) ( 602 )  
Databases of human dorsal hand vein (DHV) images are created and distributed for the purpose of testing DHV identification algorithms. For logistical and privacy reasons, these databases are often too small to fulfill their potential applications. A novel method of synthesizing new DHV images using principal component analysis (PCA) is proposed, which can be applied to enlarge existing DHV image databases. Firstly, the original database is divided into set B and set M: set B provides a feature space, and set M constitutes a sample space used for projection. Then, the projection coefficients are obtained by projecting the sample space onto the feature space. Finally, based on PCA, new data are synthesized using the feature space and the projection coefficients extracted from the existing real data. Dynamically changing the samples in the two sets allows more and more new samples to be synthesized; the number of samples in the two sets is also determined by our experiments. In this way, a new synthesized DHV image database containing 8007 subjects is built in our research from 94 original subjects. The experimental results show that the synthesized database reaches a satisfactory recognition rate of 97.84%, which indicates that the proposed method performs well and would be applicable in simulation tests.
A Cloud User Behavior Authentication Model Based on Multi-Partite Graphs
Tian Junfeng, Cao Xun
2014, 51(10):  2308-2317.  doi:10.7544/issn1000-1239.2014.20130619
Abstract ( 678 )   HTML ( 0)   PDF (3185KB) ( 681 )  
Cloud computing is developing rapidly, and the trustworthiness of the cloud platform is the key issue relating to its success or failure. Authenticating the trustworthiness of user behavior is an important part of ensuring the credibility of the cloud platform. In order to solve the problem of the trustworthiness of cloud users' behaviors, a cloud user behavior authentication model based on multi-partite graphs (BAM) is proposed. It includes a user behavior evidence layer, a layer for building behavior multi-partite graphs, and a behavior authentication layer: the behavior evidence is the basis, the multi-partite graphs are the method, and behavior authentication is the purpose. In the user behavior evidence layer, the model determines the types of evidence, collects behavior evidence and analyzes user behavior quantitatively. In the multi-partite graph layer, the model builds two multi-partite graphs based on the evidence layer and on graph theory. In the behavior authentication layer, the model builds the cloud user behavior authentication module to verify that users are trusted. Identity re-certification and risk games are introduced to enhance the security and accuracy of the model. The analysis of small-scale cloud user behaviors in simulation experiments shows that the model is accurate and effective in measuring the normal behavior of cloud users and in distinguishing malicious users from risky users, and that it has a higher detection ratio and a lower false positive ratio.
Network Coding-Based Privacy Protection Scheme for Mobile P2P Networks
Li Zhiyuan, Bi Junlei, Wang Ruchuan
2014, 51(10):  2318-2328.  doi:10.7544/issn1000-1239.2014.20120948
Abstract ( 768 )   HTML ( 0)   PDF (2989KB) ( 1504 )  
With the rapid development of mobile peer-to-peer (MP2P) applications, user privacy requirements have become increasingly urgent. However, in the distributed, highly mobile and decentralized MP2P environment, existing schemes have various security vulnerabilities. To protect user privacy in MP2P environments, a mutual anonymity node privacy protection scheme (NMA) based on network coding is proposed. Our contributions are as follows. We first design a network coding scheme which can defend against various omniscient adversary attacks. Then, the network coding is applied throughout the MP2P file-sharing application, including resource searching, resource requesting, resource responding and file download, to protect the user's identity, location and routing information. The advantage of the scheme lies in the fact that network coding and multi-agent techniques can improve network load balance, the success rate of information transmission, and the degree of anonymity. Both theoretical analysis and experimental results demonstrate that when the percentage of malicious peers is lower than 50%, the NMA scheme not only preserves efficient information transmission, but also hides the user's identity and other private information.
Biclique Cryptanalysis of Block Cipher SHACAL2
Zheng Yafei, Wei Hongru
2014, 51(10):  2329-2335.  doi:10.7544/issn1000-1239.2014.20130639
Abstract ( 963 )   HTML ( 1)   PDF (2611KB) ( 682 )  
SHACAL2 is a block cipher designed by Handschuh et al. in 2002, based on the standard hash function SHA-2. It is one of the European standard block ciphers, and has relatively high security because of its long block length and key length, 256 bits and 512 bits respectively. A few security analysis results on SHACAL2 have been published, such as impossible differential cryptanalysis and related-key rectangle attacks on reduced-round SHACAL2. Taking advantage of the characteristics of the key schedule and the permutation layer of SHACAL2, an 18-round, 32-dimensional Biclique is constructed starting from the first eight rounds of SHACAL2. Based on the constructed Biclique, a Biclique attack is applied to the full 64-round SHACAL2. The results show that recovering the entire 512-bit key of 64-round SHACAL2 with the Biclique attack requires no more than 2^{224} chosen plaintexts, with a time complexity of 2^{511.18} 64-round encryptions. Compared with known analysis results, the data complexity of the Biclique attack decreases obviously, and the time complexity is better than exhaustive search. For full-round SHACAL2, the Biclique attack is a relatively effective method, and this is the first single-key attack on full-round SHACAL2.
A Pattern Translation Method for Flags in Binary Translation
Wang Wenwen, Wu Chenggang, Bai Tongxin, Wang Zhenjiang, Yuan Xiang, Cui Huimin
2014, 51(10):  2336-2347.  doi:10.7544/issn1000-1239.2014.20130018
Abstract ( 870 )   HTML ( 5)   PDF (2995KB) ( 629 )  
Binary translation is an important way to migrate software between different hardware platforms. For binary translation systems, how to simulate the flag register on a target platform that has no flag register is a key factor affecting system performance. Previous flag analysis techniques delete unnecessary flag computations via flow analysis of flag definitions and uses during flag translation. However, for the necessary flag computations, these techniques still have to translate code to simulate them, which leads to expansion of the generated native code and loss of system performance. A pattern translation method is presented for the necessary flag computations in a binary translation system. Building on previous flag analysis techniques, this method combines each flag-defining instruction and its flag-using instructions into a flag pattern, and chooses appropriate instruction groups on the target platform to translate the flag computation according to the semantics of the pattern. This approach to flag translation reduces the expansion of the generated native code and improves the performance of the binary translation system. The experimental results demonstrate that, for the programs in SPEC CINT2006, this method reduces the amount of generated native code by 7.5% and improves system performance by 10% on average.
Interception and Identification of Guest OS Non-trapping System Call Instruction within VMM
Xiong Haiquan, Liu Zhiyong, Xu Weizhi, Tang Shibin, Fan Dongrui
2014, 51(10):  2348-2359.  doi:10.7544/issn1000-1239.2014.20130612
Abstract ( 1240 )   HTML ( 1)   PDF (3558KB) ( 744 )  
To solve the problem that the VMM cannot monitor and control certain Guest OS behaviors because the corresponding instructions do not trap in a virtualized computing environment, an approach is proposed to make those non-trapping instructions trap into the VMM by modifying their normal execution conditions so as to cause a system exception. Following this idea, methods are explored for intercepting and identifying, within the VMM, the three different non-trapping system call instructions of the x86 architecture issued by the Guest OS. The int and sysenter instructions are made to trap into the VMM by causing a GP exception, while the syscall instruction traps into the VMM by causing a UD exception; the instructions are then identified using the virtual CPU context information within the VMM. A Qemu&Kvm based prototype indicates that the VMM can successfully intercept and identify all three system call behaviors of the Guest OS, and that the performance overhead is within an acceptable range for normal applications; for example, in the unixbench shell test case, the performance overhead ratio ranges from 1900 to 2608. Compared with existing methods, the proposed methods are all based on the architecture specification, so they are transparent to the Guest OS and require no modifications to it.