ISSN 1000-1239 CN 11-1777/TP

Table of Contents

01 August 2015, Volume 52 Issue 8
Online Learning Algorithms for Big Data Analytics: A Survey
Li Zhijie, Li Yuanxiang, Wang Feng, He Guoliang, Kuang Li
2015, 52(8):  1707-1721.  doi:10.7544/issn1000-1239.2015.20150185
The advent of big data has brought a large array of applications that require real-time processing of massive, high-velocity data. Mining big data streams in a wide range of real-world applications is becoming increasingly important. Conventional batch machine learning techniques suffer from many limitations when applied to big data analytics tasks. Online learning in a stream computing mode is a promising tool for data stream learning. In this survey, we first introduce the motivation and background of big data analytics, and then present the family of classical and recent online learning methods and algorithms, which are promising for tackling the emerging challenges of mining big data in a wide range of real-world applications. The main technical content of this survey consists of three parts: 1) online learning for linear models; 2) kernel-based online learning for nonlinear models; 3) non-traditional online learning methods. This is followed by a discussion of some key problems of large-scale machine learning for big data analytics applications. Finally, we present a few typical scenarios of online learning for big data streams and discuss possible directions for ongoing and future research in this area.
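As an illustration of the first part (online learning for linear models), the following is a minimal sketch of the classical online perceptron, which updates a linear model one example at a time instead of training on a full batch. It is not taken from the survey; the function names, the learning rate `lr`, and the labels in {-1, +1} are assumptions made for the example.

```python
def predict(w, b, x):
    """Return +1 or -1 for feature vector x under the linear model (w, b)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1


def online_perceptron(stream, dim, lr=1.0):
    """Process (x, y) pairs one at a time; y must be +1 or -1.

    The model is updated only when the current example is misclassified,
    so memory use is independent of the length of the stream.
    """
    w = [0.0] * dim
    b = 0.0
    mistakes = 0
    for x, y in stream:
        if predict(w, b, x) != y:
            mistakes += 1
            w = [wi + lr * y * xi for wi, xi in zip(w, x)]
            b += lr * y
    return w, b, mistakes
```

Because `stream` can be any iterable, the same function works on a generator that reads examples as they arrive.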
Generalized Kernel Polarization Criterion for Optimizing Gaussian Kernel
Tian Meng, Wang Wenjian
2015, 52(8):  1722-1734.  doi:10.7544/issn1000-1239.2015.20150110
The choice of kernel function is a basic and challenging problem in research on kernel methods. The Gaussian kernel is popular and widely used in various kernel methods, and many universal kernel selection methods have been derived for it. However, these methods have some disadvantages, such as heavy computational complexity, difficulty of implementation, and the requirement that the classes be generated from underlying multivariate normal distributions. To remedy these problems, a generalized kernel polarization criterion is proposed to tune the parameter of the Gaussian kernel for classification tasks. By taking the within-class local structure into account and centering the kernel matrix, the criterion does better in maximizing the class separability in the feature space, and the final optimized kernel parameter leads to a substantial improvement in performance. Furthermore, the criterion function can be proved to have a determined approximate global minimum point. This property, coupled with the criterion's independence of the actual learning machine, makes the optimal parameter easier to find by many algorithms. In addition, the local kernel polarization criterion function, a special case of the generalized kernel polarization criterion function, can also be proved to have a determined approximate global minimum point. Both criteria are extended to the multiclass domain. Experimental results show the effectiveness and efficiency of the proposed criteria.
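For orientation, the sketch below computes the classical (non-generalized) kernel polarization, the sum over all pairs of y_i * y_j * K(x_i, x_j) with labels in {-1, +1}, for a Gaussian kernel, and picks the width from a candidate list. It does not implement the generalized or local criteria of the paper (no centering, no local structure); the function names and the candidate grid are assumptions made for the example.

```python
import math


def gaussian_kernel(x, z, sigma):
    """Gaussian (RBF) kernel exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_polarization(X, y, sigma):
    """Classical polarization: sum_ij y_i * y_j * K(x_i, x_j).

    Larger values mean same-class points are similar and
    different-class points are dissimilar under the kernel.
    """
    n = len(X)
    return sum(
        y[i] * y[j] * gaussian_kernel(X[i], X[j], sigma)
        for i in range(n)
        for j in range(n)
    )


def select_sigma(X, y, candidates):
    """Return the candidate width with the highest polarization."""
    return max(candidates, key=lambda s: kernel_polarization(X, y, s))
```

The criterion depends only on the data and labels, not on any trained classifier, which is the same kind of independence from the learning machine that the abstract highlights.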
Temporal Link Prediction Based on Dynamic Heterogeneous Information Network
Zhao Zeya, Jia Yantao, Wang Yuanzhuo, Jin Xiaolong, Cheng Xueqi
2015, 52(8):  1735-1741.  doi:10.7544/issn1000-1239.2015.20150183
Temporal link prediction on dynamic heterogeneous information networks, which aims to predict both when links will be built and what their types are, has been widely studied in recent years. A dynamic heterogeneous information network is a network that has different types of vertices and time-labeled edges, and in this paper we study the temporal link prediction problem in such networks. Most existing studies employ structure-based predictive methods, in which the structures fail to embed the time information. Therefore, they cannot characterize the correlation between structures and time during prediction. In this work, we first construct a structure called the time-difference-labeled path (TDLP) to combine the time information and structural features in a unified setting, and then propose TDLP, a temporal link prediction method based on time-difference-labeled paths, which combines the time information with the structural path features. Experiments on a real data set from a scholarly bibliographic website demonstrate that the proposed TDLP method performs better than state-of-the-art methods in predicting both whether and when a link will be built.
Feature Selection Algorithm Based on the Multi-Colony Fairness Model
Yang Tan, Feng Xiang, Yu Huiqun
2015, 52(8):  1742-1756.  doi:10.7544/issn1000-1239.2015.20150245
As the world gradually transforms from an information world to a data-driven world, the areas of pattern recognition and data mining are facing more and more challenges. Feature subset selection has become a necessary part of big-data pattern recognition because of the explosive growth of data. Inspired by resource-grabbing behavior in animals, the paper adds individual resource-grabbing behavior to the resource distribution model derived from the feature selection model, and proposes the multi-colony fairness algorithm (MCFA) to deal with this behavior in order to obtain a better distribution scheme (i.e., a better feature subset). The algorithm effectively fuses the strategies of random search and heuristic search. In addition, it combines filter and wrapper methods so as to reduce the amount of calculation while improving the classification accuracy. The convergence and effectiveness of the proposed algorithm are verified both mathematically and experimentally. MCFA is compared with four classic feature selection algorithms, SFS (sequential forward selection), SBS (sequential backward selection), SFFS (sequential floating forward selection), and SBFS (sequential floating backward selection), and with three mainstream feature selection algorithms, RRFS (relevance-redundancy feature selection), mRMR (minimal-redundancy-maximal-relevance), and ReliefF. The comparison results show that the proposed algorithm obtains better feature subsets in terms of both feature subset length and classification accuracy, which indicates its efficiency and effectiveness.
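To make the baselines concrete, here is a minimal sketch of SFS (sequential forward selection), one of the classic algorithms MCFA is compared against. It is not MCFA itself. The `score` callable, which rates a candidate feature subset (for example by cross-validated accuracy in a wrapper setting), is a hypothetical parameter supplied by the caller.

```python
def sequential_forward_selection(features, score, k):
    """Greedily grow a feature subset up to size k.

    At each step, add the single remaining feature whose inclusion
    gives the highest score for the enlarged subset.
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

SFS never removes a feature once it is added; the floating variants (SFFS, SBFS) named in the abstract relax that restriction by allowing backtracking steps.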
Manifold Clustering and Visualization with Commute Time Distance
Shao Chao, Zhang Xiaojian
2015, 52(8):  1757-1767.  doi:10.7544/issn1000-1239.2015.20150247
Existing manifold learning algorithms can effectively learn and visualize the low-dimensional nonlinear manifold structure of high-dimensional data. However, most of them are sensitive to the neighborhood size, which is difficult to select, and require the data to be sampled from a single manifold. To reduce the sensitivity of manifold learning algorithms to the neighborhood size, and to address the effective visualization and clustering of multi-manifold data, this paper uses the commute time distance to propose a novel manifold learning algorithm called CTD-ISOMAP (commute time distance isometric mapping). Compared with the Euclidean distance, the commute time distance probabilistically synthesizes all the paths connecting any two points in the neighborhood graph. Consequently, it takes into account the intrinsic nonlinear geometric structure of the given data while still providing robust results, and is therefore suitable for identifying the shortcut edges and inter-manifold edges that may exist in the neighborhood graph. CTD-ISOMAP thus effectively eliminates the shortcut edges in the neighborhood graph, so that the output recovers the low-dimensional nonlinear manifold structure over a much wider range of neighborhood sizes, and eliminates the inter-manifold edges in the neighborhood graph to improve the clustering of multi-manifold data obtained by spectral clustering. Finally, our experimental study verifies the effectiveness of CTD-ISOMAP.
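For reference, the standard definition of commute time on a weighted graph is given below; the abstract does not state which exact form the paper uses, so the notation here is an assumption. W is the edge weight matrix of the neighborhood graph, d_k the weighted degree of node k, L = D - W the graph Laplacian, and L^+ its Moore-Penrose pseudoinverse.

```latex
C_{ij} \;=\; \mathrm{vol}(G)\,\bigl(L^{+}_{ii} + L^{+}_{jj} - 2\,L^{+}_{ij}\bigr),
\qquad
\mathrm{vol}(G) \;=\; \sum_{k} d_k .
```

Equivalently, C_{ij} is the expected number of steps a random walk needs to go from node i to node j and back, which is why it aggregates all paths between the two nodes rather than only the shortest one.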
FSMBUS: A Frequent Subgraph Mining Algorithm in Single Large-Scale Graph Using Spark
Yan Yuliang, Dong Yihong, He Xianmang, Wang Wei
2015, 52(8):  1768-1783.  doi:10.7544/issn1000-1239.2015.20150256
Mining frequent subgraphs in a single large-scale graph is in great demand with the rapid growth of social networking. However, serial algorithms are inefficient at mining frequent subgraphs with low support in a single large-scale graph. Meanwhile, few existing distributed algorithms support growth pattern mining, and the Hadoop framework they run on is not suitable for iterative computation. In this paper, a distributed algorithm named FSMBUS is proposed for mining frequent subgraphs in a single large-scale graph under the Spark framework. It constructs candidate subgraphs in parallel using a suboptimal CAM Tree and returns all the frequent subgraphs for a given user-defined minimum support. Additionally, infrequent pattern testing and search order selection are introduced to optimize the algorithm. A Sorted-Greedy method is designed for data partitioning to balance the workload. Our experiments on real datasets show that FSMBUS runs faster and more effectively than existing algorithms, and can also handle lower support thresholds and larger graph datasets. At the same time, FSMBUS runs 2 to 4 times faster on the Spark framework than on the Hadoop framework.
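The abstract does not describe the Sorted-Greedy method in detail. One common way to balance workload with a sort-then-greedy strategy is sketched below: items are sorted by estimated cost in descending order and each is assigned to the currently lightest partition. This is an assumption about the general idea, not the FSMBUS implementation; the function name and the `loads` cost estimates are hypothetical.

```python
import heapq


def sorted_greedy_partition(loads, k):
    """Assign item indices to k partitions, heaviest items first.

    loads[i] is the estimated cost of item i. Returns a list of k
    lists of item indices.
    """
    heap = [(0, p) for p in range(k)]  # (current total load, partition id)
    heapq.heapify(heap)
    parts = [[] for _ in range(k)]
    order = sorted(range(len(loads)), key=lambda i: loads[i], reverse=True)
    for idx in order:
        total, p = heapq.heappop(heap)  # lightest partition so far
        parts[p].append(idx)
        heapq.heappush(heap, (total + loads[idx], p))
    return parts
```

Handling the heaviest items first leaves the small items to even out the remaining imbalance, which usually gives a tighter balance than assigning items in arrival order.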
A Graph Clustering Method for Detecting Protein Complexes
Wang Jie, Liang Jiye, Zheng Wenping
2015, 52(8):  1784-1793.  doi:10.7544/issn1000-1239.2015.20150180
Protein-protein interaction (PPI) networks are widely present in complex biological networks. The topological features of PPI networks play an important role in analyzing the functional modules in these networks. Some graph clustering methods have been successfully applied to complex networks to detect protein complexes in PPI networks. Traditional graph clustering algorithms for PPI analysis primarily focus on hard clustering of a network, while soft clustering algorithms that find overlapping clusters have become a hotspot of current research. Existing soft clustering algorithms pay less attention to small-scale non-dense clusters, although some small-scale non-dense clusters often have important biological meaning in PPI networks. A method for measuring the association strength of edges is developed based on node neighborhoods in networks, and then a soft clustering algorithm named flow-simulation graph clustering (F-GCL), based on flow simulation, is presented to detect complexes in a PPI network. Experiments show that F-GCL can simultaneously find overlapping clusters and small-scale non-dense clusters without increasing the running time. Compared with the MCODE (molecular complex detection), MCL (Markov clustering), RNSC (restricted neighborhood search clustering), and CPM (clique percolation method) algorithms on six Saccharomyces cerevisiae PPI networks, F-GCL shows comparable or better performance on three evaluation indicators: F-measure, Accuracy, and Separation.
Recognizing the Same Commodity Entities in Big Data
Hu Yahui, Li Shijun, Yu Wei, Yang Sha, Gan Lin, Wang Kai, Fang Qiqing
2015, 52(8):  1794-1805.  doi:10.7544/issn1000-1239.2015.20150252
The recent boom of big data and e-commerce has changed our lives by offering unprecedented convenience and enjoyment. How to identify the same commodity entities in multi-source, heterogeneous, fragmented, and inconsistent e-commerce data for better business intelligence is a valuable and challenging topic. To this end, we analyze the characteristics of Web big data and collect crawled raw commodity information from different e-commerce platforms, which constitutes multi-source, heterogeneous, massive data. Then, we build an index model based on commodity attributes and values, construct a global model map to record each commodity's attributes and values, and form a unified model and efficient commodity information for the next step. We then measure the similarity of commodity identity with a multilayer hierarchical probabilistic model, which includes identifying the possible candidate commodity set, similarity filtering of the candidate commodity set, and similarity filtering based on the special items of the candidate commodity set. Finally, we output the set of identical commodities in the inverted index list. We evaluate our method on datasets collected from three mainstream Chinese B2C e-commerce platforms using the Hadoop framework. Experimental results show the accuracy and effectiveness of our method.
Sentiment Uncertainty Measure and Classification of Negative Sentences
Zhang Zhifei, Miao Duoqian, Nie Jianyun, Yue Xiaodong
2015, 52(8):  1806-1816.  doi:10.7544/issn1000-1239.2015.20150253
Sentiment classification is a powerful technology for social media big data analysis. It is of great importance to predict the sentiment polarity of a sentence, especially of negative sentences, which are frequently used. Negation words and sentiment words play equally important roles in the sentiment classification of negative sentences. A negation word is important when it modifies a sentiment word, but it can also carry sentiment on its own. Existing methods consider negation words only when they modify sentiment words. In this paper, a unified classification model based on decision-theoretic rough sets is proposed for the sentiment classification of negative sentences. First, the sentiment value of each clause in a sentence is calculated using several lexicons and the inter-sentence relations. A novel measure of sentiment uncertainty for a sentence is given based on the Kullback-Leibler divergence. Then, the negative sentences are represented in terms of four features (initial polarity, sentiment uncertainty, successive punctuations, and sentence type) and, in particular, two negation-related features: single negation and salient adverb. Finally, a novel attribute reduction algorithm based on the decision correlation degree is used to generate the decision rules for sentiment classification of negative sentences. The experimental results show that this model is effective and that the sentiment uncertainty measure is helpful for sentiment classification.
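The abstract says the uncertainty measure is based on the Kullback-Leibler divergence but does not give its exact form. The sketch below shows only the standard KL divergence between two discrete distributions, for example two distributions over polarity classes; how the paper builds those distributions from clause sentiment values is not specified here, and the function name and `eps` guard are assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i), natural logarithm.

    p and q are sequences of probabilities over the same classes.
    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    to avoid division by zero.
    """
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )
```

The divergence is zero when the two distributions coincide and grows as they disagree, and it is not symmetric in its arguments.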
Weight-Aware Multicast Routing Algorithm in Cognitive Wireless Mesh Networks
Yang Yiqing, Chen Zhigang, Kuang Zhufang, Liu Hui
2015, 52(8):  1817-1830.  doi:10.7544/issn1000-1239.2015.20148255
Cognitive radio (CR) is an intelligent, revolutionary spectrum (channel) sharing technology and one of the most important new wireless technologies today. A cognitive wireless mesh network (CWMN) combines a wireless mesh network with CR technology. Multicast routing and spectrum allocation is an important challenge in CWMNs. In this paper, we design a weight-aware multicast routing algorithm for CWMNs. A wireless link weight computing function and algorithm (LWC) is proposed, which is aware of the weight of multicast traffic. On this basis, a distributed multicast routing and spectrum allocation algorithm with QoS constraints for cognitive wireless mesh networks (WMRA) is proposed. The objective of WMRA is to minimize the channel collision value. A priority factor is taken into account to prevent high-weight multicast sessions from incurring more collisions than low-weight multicast sessions. First, WMRA computes the weights of wireless links using LWC to construct the multicast tree. Second, WMRA computes the channel collision value in a distributed manner based on dynamic programming. Third, WMRA constructs the multicast routing path and performs spectrum allocation for the new multicast tree. Simulation results show that WMRA achieves the expected goal and a lower channel collision value.
Multi-Objective Channel Assignment and Gateway Deployment Optimizer for Wireless Mesh Network
Zhao Chuanxin, Chen Fulong, Wang Ruchuan, Zhao Cheng, Luo Yonglong
2015, 52(8):  1831-1841.  doi:10.7544/issn1000-1239.2015.20140675
Gateway deployment and channel assignment are important for wireless mesh network planning because they directly influence the network quality of service. Traditionally, the two problems are studied separately. In this paper, a comprehensive strategy is proposed to minimize both the link collision and the cost of gateway deployment for wireless mesh networks. In addition, load balance is considered in the planning stage, and the aggregation of traffic flows near the gateway in a wireless mesh network is reflected by the degree of link collision. Gateway deployment has been proved to be NP-hard. Here a novel multi-objective particle swarm algorithm is proposed to optimize both channel assignment and gateway deployment. The routes of nodes are built by a tree-creation algorithm after the channels are assigned and the gateways are selected. Thus, the two problems are decoupled. The channel assignment and gateway deployment are then obtained in polynomial time for wireless mesh network planning. Compared with existing algorithms based on balanced channel repartition, the simulation results show that the proposed algorithm reduces network collision effectively and improves network performance significantly, while reducing the path length and balancing the load of the gateways.
Fuzzy Support Vector Regression-Based Link Quality Prediction Model for Wireless Sensor Networks
Shu Jian, Tang Jin, Liu Linlan, Hu Gang, Liu Song
2015, 52(8):  1842-1851.  doi:10.7544/issn1000-1239.2015.20140670
In wireless sensor networks (WSNs), the link is a key element for achieving interconnection and multi-hop communication. Link quality is the foundation of upper-layer protocols, such as topology control, routing, and mobility management. Effective link quality prediction (LQP) can not only improve network throughput and decrease node energy consumption, but also prolong network lifetime. In this paper, we give a concrete analysis of the related work on WSN link quality prediction. A novel model, fuzzy support vector regression (FSVR), is proposed to predict link quality; it reduces the impact of noise and outliers to achieve high accuracy. The link quality samples are collected from three different scenarios. Taking the characteristics of the data distribution in unstable links into consideration, a kernel fuzzy c-means (KFCM) algorithm, an unsupervised learning algorithm, is applied to cluster the training set automatically in terms of partition coefficient and exponential separation (PCAES). The membership degrees of the samples are obtained to form the fuzzy set for FSVR. The chaos particle swarm optimization (CPSO) algorithm is employed on each cluster to choose a suitable parameter combination for the model. The experimental results show that, compared with empirical risk-based BP neural network prediction methods, the proposed prediction model achieves higher accuracy and better generalization ability.
A Source Data Congestion Control Based on Sleep Schedule
Huang Junjie, Chen Xiaojiang, Liu Chen, Fang Dingyi, Wang Wei, Yin Xiaoyan, Wu Yueshan
2015, 52(8):  1852-1861.  doi:10.7544/issn1000-1239.2015.20140668
WSNs usually operate in duty-cycle mode for long-lifetime monitoring. This operating mode makes the communication links dynamic, which introduces a new kind of congestion in the network: source data congestion (SDC). Source data congestion can cause a WSN node's buffer to overflow, cause data loss, and even leave the node unable to respond to any forwarding requests. This problem gets worse in sensor-heterogeneous WSNs, because some nodes may generate data in bursts. Many solutions to network congestion focus only on making data forwarding bypass the congested node, or on controlling the traffic rate. These solutions do not help with source data congestion because this congestion is caused by an improper duty-cycle mode. To address this situation, this paper analyzes the factors that influence source data congestion and proposes a model, called the conveyor belt model, to describe the probability of source data congestion accurately. Based on this model, we propose a sleep scheduling method that aims to decrease the probability of source data congestion by reconfiguring the sleep timers of some nodes. Furthermore, our theoretical analysis and extensive simulation show that the model can accurately predict source data congestion, and that our method can decrease the probability of source data congestion significantly.
A Method of Provable Data Integrity Based on Lattice in Cloud Storage
Tan Shuang, He Li, Chen Zhikun, Jia Yan
2015, 52(8):  1862-1872.  doi:10.7544/issn1000-1239.2015.20140610
Using cloud storage technology, users can outsource their data to the cloud. Such outsourcing meets the requirements of saving hardware costs and simplifying data management. However, because users no longer store any copies of the data locally, they cannot fully ensure that the outsourced data remain intact. Further, considering the client's constrained computing power and the large size of the outsourced data, the client cannot spend extra time and effort to verify data correctness in the cloud environment. Therefore, the inability to ensure the integrity of the outsourced data leads to many security threats. To solve this problem, we present a lattice-based provable data integrity scheme for checking the integrity of data in the cloud. The proposed scheme not only detects any violations of client data in the cloud, but also has been proven secure in the random oracle model. In particular, as opposed to schemes based on factoring or discrete logarithms, the proposed scheme resists cryptanalysis by quantum algorithms. Moreover, the proposed protocol has three other desirable attributes: support for data dynamics, computing on signed data, and multi-client verification. Finally, we present a comparison of existing data integrity verification mechanisms, as well as some open problems of lattice-based provable data integrity.
Game Optimization for Internal DDoS Attack Detection in Cloud Computing
Wang Yichuan, Ma Jianfeng, Lu Di, Zhang Liumei, Meng Xianjia
2015, 52(8):  1873-1882.  doi:10.7544/issn1000-1239.2015.20140608
A collaborative intrusion detection system (IDS) model, called virtual machine introspection & network-based IDS (VMI-N-IDS), is proposed. Built on traditional introspection-based IDS and network-based IDS, it defends against internal distributed denial of service (DDoS) attack threats in cloud clusters (e.g., the cloud droplets freezing (CDF) attack). The CDF attack can exhaust the internal bandwidth of the cluster as well as the CPU and memory resources of physical servers. Based on game theory, the IDS and the attacker are treated as the two game parties in the VMI-N-IDS model. Utility functions of the two parties are given, and it is proved that the game model is a non-cooperative repeated game of incomplete information in which a subgame perfect Nash equilibrium exists. Finally, the optimal defense strategy is proposed, which is a tradeoff between the false alarm rate and control of the malicious software scale, to solve the problem of dynamically adjusting the internal intrusion detection strategy. The best strategy for the IDS at each stage is to increase the threshold value β when the mathematical expectation of the suspicious value is greater than the load of server resources, and to reduce it otherwise. Experimental results show that the proposed method can effectively defend against the internal DDoS attack threat in the cloud environment.
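The per-stage rule stated at the end of the abstract can be sketched as a simple threshold update. The abstract gives only the direction of the adjustment, so the step size, the clamping range, and all names below are assumptions made for illustration; they are not values from the paper.

```python
def adjust_threshold(beta, expected_suspicion, resource_load,
                     step=0.05, lower=0.0, upper=1.0):
    """Return the detection threshold for the next stage.

    Raise beta when the expected suspicious value exceeds the server
    resource load, lower it otherwise (hypothetical step and bounds).
    """
    if expected_suspicion > resource_load:
        beta += step
    else:
        beta -= step
    return min(upper, max(lower, beta))
```

Calling this once per detection stage yields the kind of dynamic adjustment the abstract describes, with the clamp keeping β inside a valid range.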
A Novel Privacy Aware Secure Routing Protocol for HWMN
Lin Hui, Tian Youliang, Xu Li, Hu Jia
2015, 52(8):  1883-1892.  doi:10.7544/issn1000-1239.2015.20140606
The hybrid wireless mesh network (HWMN) is the most practical architecture among wireless mesh networks (WMNs). It provides a fusion of heterogeneous wireless networks, network connectivity, and coverage to distributed users over a wide area to support applications in different domains, such as military, finance, and healthcare. Due to its multi-hop and decentralized network architecture, HWMN is susceptible to various security threats, especially internal attacks targeting routing and privacy security. However, existing routing protocols for HWMN cannot ensure security or protect user privacy effectively, as they are vulnerable to internal attacks. Moreover, few of them take energy consumption into account in routing. To solve these problems, and in light of the characteristics of HWMN, this paper proposes a dynamic reputation mechanism based privacy-aware secure routing protocol (RPASRP), which combines a new dynamic reputation mechanism with a hierarchical key management protocol and a hierarchical encryption scheme. Furthermore, energy consumption is taken into account in the routing process. Simulation results show that RPASRP provides more effective security and privacy protection against internal attacks and decreases energy consumption in the routing process.
A Fully Secure KP-ABE Scheme in the Standard Model
Zhang Minqing, Du Weidong, Yang Xiaoyuan, Han Yiliang
2015, 52(8):  1893-1901.  doi:10.7544/issn1000-1239.2015.20140605
With the emergence of many new applications such as social networks and cloud storage, attribute-based encryption (ABE) has been studied and applied widely because of its great flexibility, high efficiency, and high security. Because most existing attribute-based encryption schemes are only selectively secure, which does not meet practical needs well, how to construct fully secure ABE has become a focus of cryptography. To address this problem, a key-policy ABE scheme is first constructed in this paper using dual system encryption, and the scheme is then proved to be fully secure in the standard model with the new ideas proposed by Lewko and Waters. Finally, the comparison results show that the public and private key lengths of our scheme are similar to those of the selectively secure GPSW scheme, but our scheme is more secure. Compared with the Lewko-Waters scheme, our scheme has the same security but shorter public and private keys, and is therefore more efficient. Moreover, similar to the ciphertext-policy scheme of Lewko-Waters, techniques from selective security are also used in the security proof of our key-policy ABE, which is important for research on the relation between the selective and full security models.
Using Code Mobility to Obfuscate Control Flow in Binary Codes
Chen Zhe, Wang Zhi, Wang Xiaochu, Jia Chunfu
2015, 52(8):  1902-1909.  doi:10.7544/issn1000-1239.2015.20140607
Code obfuscation is commonly used in software protection and by malware to resist reverse engineering. Traditional code obfuscation methods have some security issues because reverse engineers can acquire all the binary code. To mitigate this problem, this paper presents a novel control flow obfuscation approach, based on code mobility, to protect the control flow of binary code. Moving the significant control logic code to a remote trusted entity beyond the adversary's control makes some control flow information invisible in the local untrusted execution environment, so that the binary code's key behaviors cannot be predicted statically or dynamically. Unconditional jump instructions without control information are used to replace some critical conditional jumps to hide branch conditions and jump target memory addresses, which increases the difficulty of collecting and reasoning about the program path information. We evaluate this obfuscation approach in three aspects: potency, resilience, and cost. Using this approach, we obfuscate the trigger conditions in six malware samples belonging to different families, and then use state-of-the-art reverse engineering tools to reason about their internal control logic. Experimental results show that our obfuscation approach is able to protect various branch conditions and reduce the leakage of branch information at run time, which impedes reverse engineering based on symbolic execution from analyzing the program's internal logic.
Texture-Based Multiresolution Flow Visualization
Lu Daying, Zhu Dengming, Wang Zhaoqi
2015, 52(8):  1910-1920.  doi:10.7544/issn1000-1239.2015.20140417
When there is a discrepancy between flow-field resolution and screen resolution, traditional multiresolution texture advection methods easily degrade visual perception, producing texture aliasing artifacts and a lack of detail. To address these issues, we propose an adaptive multiresolution texture rendering algorithm for flow visualization. The algorithm is based on the construction of a texture advection volume, which is an intermediate representation of the underlying flow field. Using this intermediate geometry, the trajectory for flow texture advection can be obtained more accurately at arbitrary resolutions. The visual contrast of the texture is maintained by the mapping from texture space to advection volume space, in combination with a novel texture blending approach that considers the characteristics of the reference texture. Finally, the appropriate resolution for flow regions is selected adaptively through mip-mapping of the advection volume and noise texture, to avoid unsatisfactory rendering results as the user zooms in and out of the field. The validity of multiresolution algorithms is examined with an objective assessment metric based on particle position errors and with a subjective assessment. The high match rate reveals good consistency between the objective metric and the subjective assessment. Experimental results also demonstrate the accurate traces and high-quality detail achieved by the proposed algorithm.
Activity Anomaly Detection Based on Vehicle Trajectory of Automatic Number Plate Recognition System
Sun Yuyan, Sun Limin, Zhu Hongsong, Zhou Xinyun
2015, 52(8):  1921-1929.  doi:10.7544/issn1000-1239.2015.20140673
Anomaly detection is a major direction of intelligent traffic management, but current studies may not yield the best results in the field of public safety. This paper proposes a machine-learning based technique to detect vehicle anomalies from vehicle trajectory data captured by an automatic number plate recognition (ANPR) system. Our scheme is capable of detecting vehicles that wander around and vehicles with unusual activity at specific times. First, spatial and temporal quantitative indicators of vehicle activity features are extracted from historical vehicle trajectory data. Vehicles with unusual spatial features are found, and their cumulative rotation angles around the centroid of the route are calculated to detect spatial wandering behavior. The distances from the centers of the clusters created by the K-means clustering algorithm on the temporal feature vectors are computed to find outliers. We collected records from an ANPR system with 315 cameras deployed in the real world for more than two months, capturing over 5.4 million vehicles. The evaluation results on this data set show the efficiency of the anomaly detection. More importantly, our scheme can significantly improve detection robustness, especially when the data collected by the ANPR system are noisy due to poor weather conditions.
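The cumulative rotation angle around the route centroid can be sketched as follows. This is one plausible reading of the indicator named in the abstract, not the authors' code: the route is assumed to be a list of planar (x, y) camera positions in visiting order, and the function name is hypothetical. A route that circles its centroid accumulates roughly a full turn per lap, while a direct route accumulates at most about half a turn.

```python
import math


def cumulative_rotation(points):
    """Sum of signed angle changes (radians) of a route around its centroid.

    points is a list of (x, y) positions in visiting order. Each step's
    angle change is wrapped into (-pi, pi] before being accumulated.
    """
    if len(points) < 2:
        return 0.0
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    angles = [math.atan2(y - cy, x - cx) for x, y in points]
    total = 0.0
    for prev, cur in zip(angles, angles[1:]):
        delta = cur - prev
        while delta > math.pi:
            delta -= 2.0 * math.pi
        while delta <= -math.pi:
            delta += 2.0 * math.pi
        total += delta
    return total
```

A threshold on the absolute value of the result (for example, more than one full turn) could then flag candidate wandering routes; any such cutoff would need to be chosen from data.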