ISSN 1000-1239 CN 11-1777/TP

Table of Content

01 December 2017, Volume 54 Issue 12
Training Deep Neural Networks for Image Applications with Noisy Labels by Complementary Learning
Zhou Yucong, Liu Yi, Wang Rui
2017, 54(12):  2649-2659.  doi:10.7544/issn1000-1239.2017.20170637
Asbtract ( 1008 )   HTML ( 1)   PDF (3210KB) ( 697 )  
Related Articles | Metrics
In recent years, deep neural networks (DNNs) have made great progress in many fields such as image recognition, speech recognition and natural language processing, etc. The rapid development of the Internet and mobile devices promotes the popularity of image applications and provides a large amount of data to be used for training DNNs. Also, the manually annotated data is the key of training DNNs. However, with the rapid growth of data scale, the cost of manual annotation is getting higher and the quality is hard to be guaranteed, which will damage the performance of DNNs. Combining the idea of easy example mining and transfer learning, we propose a method called complementary learning to train DNNs on large-scale noisy labels for image applications. With a small number of clean labels and a large number of noisy labels, we jointly train two DNNs with complementary strategies and meanwhile transfer the knowledge from the auxiliary model to the main model. Through experiments we show that this method can efficiently train DNNs on noisy labels. Compared with current approaches, this method can handle more complicated noise labels, which demonstrates its value for image applications.
Protein Function Prediction Based on Multiple Networks Collaborative Matrix Factorization
Yu Guoxian, Wang Keyao, Fu Guangyuan, Wang Jun, Zeng An
2017, 54(12):  2660-2673.  doi:10.7544/issn1000-1239.2017.20170644
Asbtract ( 795 )   HTML ( 0)   PDF (2039KB) ( 470 )  
Related Articles | Metrics
Accurately and automatically predicting biological functions of proteins is one of the fundamental tasks in bioinformatics, and it is also one of the key applications of artificial intelligence in biological data analysis. The wide application of high throughput technologies produces various functional association networks of molecules. Integrating these networks contributes to more comprehensive view for understanding the functional mechanism of proteins and to improve the performance of protein function prediction. However, existing network integration based solutions cannot apply to a large number of functional labels, ignore the correlation between labels, or cannot differentially integrate multiple networks. This paper proposes a protein function prediction approach based on multiple networks collaborative matrix factorization (ProCMF). To explore the latent relationship between proteins and between labels, ProCMF firstly applies nonnegative matrix factorization to factorize the protein-label association matrix into two low-rank matrices. To employ the correlation between labels and to guide the collaborative factorization with proteomic data, it defines two smoothness terms on these two low-rank matrices. To differentially integrate these networks, ProCMF sets different weights to them. In the end, ProCMF combines these goals into a unified objective function and introduces an alternative optimization technique to jointly optimize the low-rank matrices and weights. Experimental results on three model species (yeast, human and mouse) with multiple functional networks show that ProCMF outperforms other related competitive methods. ProCMF can effectively and efficiently handle massive labels and differentially integrate multiple networks.
EAE: Enzyme Knowledge Graph Adaptive Embedding
Du Zhijuan, Zhang Yi, Meng Xiaofeng, Wang Qiuyue
2017, 54(12):  2674-2686.  doi:10.7544/issn1000-1239.2017.20170638
Asbtract ( 881 )   HTML ( 4)   PDF (3106KB) ( 545 )  
Related Articles | Metrics
In recent years a drastic rise in constructing Web-scale knowledge graph (KG) has appeared and the deal with practical problems falls back on KG. Embedding learning of entities and relations has become a popular method to perform machine learning on relational data such as KG. Based on embedding representation, knowledge analysis, inference, fusion, completion and even decision-making could be promoted. Constructing and embedding open-domain knowledge graph (OKG) has mushroomed,which greatly promots the intelligentization of big data in open domain. Meanwhile, specific-domain knowledge graph (SKG) has become an important resource for smart applications in specific domain. However, SKG is developing and its embedding is still in the embryonic stage. This is mainly because there is a germination in SKG due to the difference for data distributions between OKG and SKG. More specifically: 1) In OKG, such as WordNet and Freebase, sparsity of head and tail entities are nearly equal, but in SKG, such as Enzyme KG and NCI-PID, inhomogeneous is more popular. For example, the tail entities are about 1000 times more than head ones in the enzyme KG of microbiology area. 2) Head and tail entities can be commuted in OKG,but they are noncommuting in SKG because most of relations are attributes. For example, entity “Obama” can be a head entity or a tail entity, but the head entity “enzyme” is always in the head position in the enzyme KG. 3) Breadth of relation has a small skew in OKG while imbalance in SKG. For example, a enzyme entity can link 31809 x-gene entities in the enzyme KG. Based on observation, we propose a novel approach EAE to deal with the 3 issues. We evaluate our approach on link prediction and triples classification tasks. Experimental results show that our approach outperforms Trans(E, H, R, D and TransSparse) significantly, and achieves state-of the-art performance.
A Link Prediction Model for Clinical Temporal Knowledge Graph
Chen Dehua, Yin Suna, Le Jiajin, Wang Mei, Pan Qiao, Zhu Lifeng
2017, 54(12):  2687-2697.  doi:10.7544/issn1000-1239.2017.20170640
Asbtract ( 2200 )   HTML ( 18)   PDF (2758KB) ( 956 )  
Related Articles | Metrics
Link prediction on knowledge graph is the main task of knowledge base completion, predicting whether a relationship existing in the knowledge base is likely to be true. However, traditional knowledge link prediction models are only appropriate for static data rather than temporal knowledge base. Temporal knowledge base exists on various fields. Take medical medicine field as example, diabetes is a typical chronic disease which evolves slowly. Thus, link prediction on clinical knowledge base such as diabetic complication requires the analysis on temporal characteristic of temporal knowledge base, which is a great challenge for traditional link prediction models. Thus, to address the prediction of temporal knowledge base, this paper proposes a long short-term memory (LSTM) based model for temporal knowledge base. The proposed model adopts memory cells of LSTM for sequential learning, and then builds incremental learning layer. Afterwards, timing characteristics can be extracted by the way of end-to-end, which realizes the prediction on temporal knowledge base. In experiments, the proposed model in clinical temporal knowledge base shows significant improvements compared with baselines including Rescal, NTN, TransE, TransH, TransR and DNN.
Indoor Trajectory Tracking Algorithm Based on Time Series Heuristic Information
Qin Junping, Deng Qingxu, Sun Shiwen, Renqing Daoerji, Tong Haibin, Su Xianli
2017, 54(12):  2698-2710.  doi:10.7544/issn1000-1239.2017.20160803
Asbtract ( 806 )   HTML ( 0)   PDF (4081KB) ( 426 )  
Related Articles | Metrics
Existing indoor trajectory tracking algorithms on wireless sensor network are based on continuous localization and can not make use of the heuristic information of RSSI time series within a certain temporal and spatial range. The heuristic information of RSSI time series is a key factor of trajectory tracking procedure. This paper proposes a new trajectory tracking algorithm on spatiotemporal correlation model based on heuristic information. According to the heuristic information related to moving trajectory, the new method contains the following essential phases. Firstly, we model the trajectory tracking model reflecting spatiotemporal correlation and statistical characteristics. Secondly, we detect spanning boundary event and judge which subarea the unknown node was in by means of information fusion of RSSI time series and moving least square method. Finally, the moving trajectory of unknown node is formed by means of dynamic time warping fingerprinting matching algorithm with heuristic information constraints. The principles of information fusion are strictly proved in mathematics. The field experiments and the simulation experiments show that the algorithm has good environment adaptability, smooth trajectory and the error does not accumulate among the subareas. Compared with the existing methods, the accuracy of trajectory tracked is improved.
EasiRCC: A Method of Rule-Matching and Conflict Resolution for Smart Home
Huang Xiaohui, Li Dong, Shi Hailong, Cui Li
2017, 54(12):  2711-2720.  doi:10.7544/issn1000-1239.2017.20160646
Asbtract ( 559 )   HTML ( 2)   PDF (2768KB) ( 355 )  
Related Articles | Metrics
With the rapid development of the Internet of things, smart home based on Internet of things has gradually entered into family life of people. The user can accord to the requirement for personalized and customized service life. However, smart home systems appear conflicts between rules frequently, because the number of rules is becoming more and more. Therefore, this paper proposes EasiRCC, a new-type method of rapid rule-matching and conflict resolution. Resolving the conflict problem mainly focuses on rule-matching and conflict resolution. Firstly, for the problem of rule-matching, because repeating matching takes place in the existing methods frequently, we propose an algorithm named EasiRMA, which is used to fast match rule based on Hash function and raise the efficiency of rule-matching. Secondly, for the problem of conflict resolution, we come up with a scheduling mechanism of mixed-priority, which the system can adaptively adjust priority of rules, and resolve rule-conflict in time. Experimental results show that EasiRCCs efficiency of rule-matching is not changing as the number of rules is increased, and its running time is constant, but the running time of traditional matching method is O(N), and EasiRCC can effectively resolve conflict in the condition that does not affect users normal household lives.
The Dynamical Prediction of V2V Link Duration in Urban VANETs
Wang Xiufeng, Cui Gang, Wang Chunmeng
2017, 54(12):  2721-2730.  doi:10.7544/issn1000-1239.2017.20158391
Asbtract ( 619 )   HTML ( 7)   PDF (4057KB) ( 432 )  
Related Articles | Metrics
Link duration prediction is an important standard which determines many performance of network in VANETs. Existing analytical methods about link duration based on mobility of nodes in VANETs have no function to predict link duration between any two nodes in the future, so it is not practical for these methods to predict link duration between two vehicles. We propose a dynamical prediction model which considers the distribution of relative velocity, inter-vehicle distance, traffic density change and traffic light to estimate the expected link duration between any pair of connected vehicles, because these factors change continuously in the process of link connection. By taking into account the relative velocity distribution, the model is able to adjust the principle in real time to adapt variation of vehicle speed. By automatically adjusting computing method of the relative distance between two vehicles, DPLD(dynamically predict link duration) model can automatically adapt to the change of relative distance between two vehicles. Therefore, DPLD model can effectively predict the link duration between the two vehicles. Such model is implemented on each vehicle along with parameters estimation methods of relative velocity distribution, exponential moving average method processes speed exception and considering the impact of the traffic light on link duration. Simulation results show that this model predict link duration for urban scenario has the high accuracy.
A Time Window Based Lightweight Real-Time Activity Recognition Method
Dong Lihua, Liu Qiang, Chen Haiming, Cui Li
2017, 54(12):  2731-2740.  doi:10.7544/issn1000-1239.2017.20150462
Asbtract ( 620 )   HTML ( 0)   PDF (1986KB) ( 392 )  
Related Articles | Metrics
Lack of exercise or excessive exercise would damage our body, so real-time human activity recognition using smart-phones or wearable devices can keep people aware of their physical status and contribute to proper exercise. However, most of existing high-precision activity recognition algorithms cannot directly apply to smart-phones or wearable devices due to their high computational cost and large storage request. Meanwhile, these algorithms cannot update recognition model according to user behavior. We propose a time window based lightweight real-time activity recognition method (EasiSports) which uses SVM to recognize five kinds of human activity status including sitting, walking, running, walking upstairs and downstairs with one accelerate sensor and fair low computational cost. Our method can set up personalized activity recognition model in many cases using k-means clustering and preset data to further improve the accuracy and reduce computational cost. Experiments show that our method can achieve an accuracy of 87.45% in recognizing five kinds of human activity status above mentioned. The performance of our algorithm can improve more than 30% compared with other algorithms.
Acoustic Self-Calibrating Indoor Localization System via Smartphones
Lin Feng, Zhang Lei, Li Guinan, Wang Zhi
2017, 54(12):  2741-2751.  doi:10.7544/issn1000-1239.2017.20160727
Asbtract ( 744 )   HTML ( 3)   PDF (4645KB) ( 469 )  
Related Articles | Metrics
Growing needs for the indoor location based service (ILBS) bring newer and higher requirements for indoor localization systems, such as high accuracy, hardware compatibility, low cost for commercial application, instantaneity and fast data update rate etc. In order to meet those requirements with commercial smartphone platform, we design an indoor localization system named LinLoc, which includes a new self-calibrating approach and a new localization method. Based on TPSN ranging, LinLoc applies time-of-arrival (TOA) method with acoustic signals to achieve real-time users' localization on normal commercial smartphone platform. With no extra time synchronization need, it can achieve centimeter-level accuracy. Furthermore, we propose a new self-calibrating approach based on acoustic TPSN ranging and semidefinite programming (SDP) algorithm. Through the interaction of every anchor nodes in the network, the new approach helps to solve the problem of self-calibrating in large-scale anchor network, and also helps to remove the heavy maintenance requirements afterwards. Then, LinLoc system which consists of a special-designed anchor network, smartphones installed with real-time app inside, and a backend server for processing is implemented. Simulations and experiments have been performed. The results show that LinLoc has nice indoor localization performance and its accuracy can be 0.05~0.3m, which provides accurate ILBS for users.
Research on Key Technologies of RFID based Device Free Gesture Recognition
Wang Xuan, Fang Hechuan, Chang Liqiong, Wang Ju, Chen Xiaojiang, Fang Dingyi, Peng Yao, Chen Feng
2017, 54(12):  2752-2760.  doi:10.7544/issn1000-1239.2017.20160648
Asbtract ( 919 )   HTML ( 4)   PDF (2974KB) ( 468 )  
Related Articles | Metrics
As a vital component of human-computer interaction, gesture recognition has gained a lot of attentions in civilian applications recently. Many applications would benefit from such gesture recognition, e.g. smart phone, smart home system, Kinect, etc. In this work, we introduce an RFID (radio frequency identification) based solution to recognize gestures using COTS (commercial off-the-shelf) RFID tags and readers. The basic idea is treating signal features caused by perturbation of gestures as the fingerprint. Furthermore, we leverage multipath to increase the difficulty of matching, which provides a high recognition rate. Unlike past work, which required user to attach an RFID tag, our method does not require any device to be attached to users. Specifically, we solve the problem that RFID communication is discontinuous in time domain by data slicing. We then capture and extract characteristic matrix corresponding to each gesture by using a synthetic aperture radar (SAR). Finally, we adopt dynamic time warping (DTW) techniques to recognize gestures by matching with the priori gesture fingerprint database. We implement our method using a COTS RFID device and evaluate it with 6 commercial tags. Experimental results demonstrate that this device free method is feasible and it can achieve a correct recognition rate of about 85%.
Path and Port Address Hopping Based SDN Proactive Defense Technology
Zhang Liancheng, Wei Qiang, Tang Xiucun, Fang Jiabao
2017, 54(12):  2761-2771.  doi:10.7544/issn1000-1239.2017.20160461
Asbtract ( 733 )   HTML ( 3)   PDF (3260KB) ( 441 )  
Related Articles | Metrics
Existing path hopping technologies are not so efficient for defending global network interception and analysis attackers, and existing port and address hopping technologies spend too much effect on hopping synchronization and are difficult to be deployed. In order to mitigate these problems, a path and port address hopping based SDN proactive defense (PPAH-SPD) scheme, making full use of SDN network characteristics (such as control plane and data plane separation, logically centralized control) and introducing of multi-path routing, is proposed. PPAH-SPD scheme models the path hopping problem as a constraint solving problem, and utilizes satisfiability modulo theory solver to obtain multiple available paths, which satisfy overlap and capacity constraints. According to path hopping strategy and specific hopping interval, SDN controller installs corresponding flow entries into all OpenFlow switches along every specific path, and these switches can then use these flow entries to properly forward the corresponding network flows, and simultaneously change their address and port information. Theoretical analysis and experimental results show that PPAH-SPD scheme can not only achieve transmission path hopping and port and address random hopping along every single transmission path with comparatively small communication time delay and computation overhead, and but also improve proactive defense capability of SDN network to resist global network interception and analysis attack, denial of service attack and insider threat.
Strongly Secure Anonymous Implicit Authentication and Key Agreement for Roaming Service
Chen Ming
2017, 54(12):  2772-2784.  doi:10.7544/issn1000-1239.2017.20160485
Asbtract ( 684 )   HTML ( 1)   PDF (1465KB) ( 412 )  
Related Articles | Metrics
The existing two-party authentication and key agreement protocols for roaming service are provably secure in the CK model, and do not resist the attack of ephemeral secrets reveal. Based on elliptic curve cryptography and identity-based cryptosystem, we propose an anonymous two-party authentication and key agreement scheme for roaming service. The new scheme, based on the Schnorr signature, achieves mutual implicit authentication by a well designed “challenge-response” signature which is similar to the one in the HMQV protocol. We extend the ID-BJM model, a widely used security model for analyzing identity-based authenticated key agreement protocols, to simulate two-party authentication and key agreement schemes for roaming service. Furthermore, we demonstrate that the new scheme is eCK secure under the extended ID-BJM model, and that the security of the new scheme can be reduced to solve (by a polynomial-time adversary) computational Diffie- Hellman problems on an elliptic curve over finite fields. Comparative analysis shows that the new scheme has stronger security, achieves resistant to ephemeral secrets reveal, needs fewer cryptography libraries, and has lower computing, communication and storage overheads. The new scheme can be used to provide secure roaming authentication for resource constrained mobile terminals in global mobility networks, Internet of things or ubiquitous networks.
An Efficient Association Rule Hiding Algorithm Based on Cluster and Threshold Interval
Niu Xinzheng, Wang Chongyi, Ye Zhijia, She Kun
2017, 54(12):  2785-2796.  doi:10.7544/issn1000-1239.2017.20160612
Asbtract ( 558 )   HTML ( 1)   PDF (4073KB) ( 357 )  
Related Articles | Metrics
Association rules hiding is a very important method of privacy-preserving data mining (PPDM). Because the current association rules hiding algorithm operates the transaction database directly, it leads to a lot of I/O overhead. To solve this problem, we put forward a quick association rules hiding algorithm based on FT-tree, called FP-DSRRC. Firstly, the algorithm improves the structure of FP-tree by adding an index to the transaction number and establishing the bidirectional traverse structure. Then FP-DSRRC uses the improved FP-tree to quickly handle transaction data set, avoiding a large number of I/O overhead caused by traversal the raw transaction data set. Furthermore, FP-DSRRC finds the sensitive items quickly by building and maintaining a transaction index table, and then handles the association rules based on the clustering strategy. We eliminate the sensitive rules by clusters, and reduce the negative influence caused by association rules hiding progress to the original data set by adopting the idea of rule support and confidence degree interval at the same time. Finally, the experiment shows that compared with traditional association rules hiding algorithm, the executive time of FP-DSRRC has been decreased by 50%~70% while guaranteeing the quality of general data, moreover, FP-DSRRC has better availability on a large-scale real data set.
Key Agreement Protocols of Non-Signature Authentication Based on Binary Tree
Wu Fusheng, Zhang Huanguo
2017, 54(12):  2797-2804.  doi:10.7544/issn1000-1239.2017.20160791
Asbtract ( 760 )   HTML ( 3)   PDF (1466KB) ( 431 )  
Related Articles | Metrics
Protocol is the specification of the network communication. Then cryptographic protocol, whose safety is based on signature or authentication technology, is one of the key techniques of information security. The technique of signature or authentication needs huge computation during communicating, which brings barriers for many communication devices because of their limited computing power. Therefore, it is an aim of studying information security to design a secure cryptographic protocol, which is practical but doesn't need huge computation. In this paper, a novel key agreement protocol is proposed, which is based on the binary tree and the homomorphic mapping of integer multiplication. Meanwhile, an experiment is carried out in an open source (OpenSSL) systems to test how nodes of leaf binary trees affect network communication and the result of the experiment is analyzed. Our scheme is successful because our key agreement protocol is proved to be safe in the random oracle model. That is to say, in the PKI system, our key agreement protocol meets the requirement of the indistinguishable chosen plaintext attack (IND-CPA ) security. Compared with previous protocols (like MTI, MQV, HMQV), our key agreement protocol has many advantages: the computation is small; only one strong security assumption is needed; it dispenses with extra authentication of MAC and digital signature; and communicating parties can authenticate implicitly through unsafe channels.
An Algorithm for Differential Privacy Streaming Data Adaptive Publication
Wu Yingjie, Zhang Liqun, Kang Jian, Wang Yilei
2017, 54(12):  2805-2817.  doi:10.7544/issn1000-1239.2017.20160555
Asbtract ( 775 )   HTML ( 4)   PDF (5223KB) ( 473 )  
Related Articles | Metrics
Nowadays, many practical applications need to publish streaming data continuously. Most of existing research works for differential privacy single streaming data publication focus on range accumulation. However, many practical scenarios need to answer arbitrary range counting queries of streaming data. At the same time, there exist specific rules of queries from users, so adaptive analysis and calculation for historical queries should be concerned. To improve the usability of published data, an algorithm HQ_DPASP for differential privacy streaming data adaptive publication based on historical queries is proposed. Combining the characteristics of streaming data, HQ_DPASP firstly uses the sliding window mechanism to construct the differential privacy range tree of the streaming data dynamically. Secondly, by analyzing the coverage probability of tree nodes and calculating the privacy parameters from leaves to root, HQ_DPASP allocates privacy budget from root to leaves and adds non-uniform noise on tree nodes. Finally, the privacy budget of tree nodes and tree's parameters are adjusted adaptively based on the characteristic of historical queries. Experiments are designed for testing the feasibility and effectiveness of HQ_DPSAP. The results show that HQ_DPSAP is effective in answering arbitrary range counting queries on the published streaming data while assuring low mean squared error of queries and high algorithm efficiency.
Optimization for Broadcast Encryption in Cloud Using Extended Public Key
Li Chunhua, Wang Hua, Zhang Yanzhe, Zhou Ke
2017, 54(12):  2818-2824.  doi:10.7544/issn1000-1239.2017.20170902
Asbtract ( 621 )   HTML ( 0)   PDF (1698KB) ( 324 )  
Related Articles | Metrics
Security issues have been a major hurdle for the application of cloud storage. As data encryption is the mainstream method to ensure confidentiality, users always share their data by means of key's management and distribution. However, how to manage massive keys and distribute them securely and efficiently is a challenge in cloud storage. In recent years, broadcast encryption scheme has been paid more attention by researchers to mitigate above problems for cloud data sharing. Since current schemes take insufficient account of changes of users and users's privilege, they do not perform well in cloud. To reduce the overhead of key distribution, an optimization method is proposed for public-key based broadcast encryption in this paper. First, the scope of public keys is expanded to two or more times and the initial related parameters used for generating public keys are kept simultaneously. These parameters can ensure private keys distributed previously still available when they are employed to generate the new public keys for new valid users, thus greatly decreases the cost of redistributing private keys. Second, lazy revocation is adopted to reduce the cost of updating keys. Experimental results show that our optimized method outperforms the existing schemes while adding new users and revoking users' privilege in cloud.
Trajectory Privacy Preserving Based on Statistical Differential Privacy
Zhu Weijun, You Qingguang, Yang Weidong, Zhou Qinglei
2017, 54(12):  2825-2832.  doi:10.7544/issn1000-1239.2017.20160647
Asbtract ( 889 )   HTML ( 5)   PDF (1721KB) ( 519 )  
Related Articles | Metrics
With the continuous development of Internet of vehicles, Internet of vehicles provides the convenient services to drivers and passengers. But it also brings some new problems of privacy protection. The existing methods for trajectory data publishing may leak users' location privacy. Thus, it may endanger the users' personal safety. In order to avoid the drawbacks of adding random noise in the existing methods for differential privacy protection, we propose a novel method for trajectory privacy protection based on statistical differential privacy. At first, one can calculate the sensitivity of position nodes in vehicle traces according to the characteristics of traces since there are some characteristics of Markov process in vehicle traces. And then, one can add some moderate Laplace noises according to the sensitivity of position nodes, statistical threshold and sensitivity threshold. As a result, the new method is obtained. Evaluating the availability of the trajectory data through the average relative error, the experimental results verify the availability and effectiveness of the proposed approach for privacy preserving based on statistical differential privacy.
Design of Block Cipher Processor Based on Stream Architecture
Li Gongli, Dai Zibin, Xu Jinhui, Wang Shoucheng, Zhu Yufei, Feng Xiao
2017, 54(12):  2833-2842.  doi:10.7544/issn1000-1239.2017.20160670
Asbtract ( 566 )   HTML ( 2)   PDF (2244KB) ( 286 )  
Related Articles | Metrics
To improve the performance of cipher processor, the performance model of cipher processor is proposed. And based on this model, three ways for enhancing cipher processor's performance are presented, which are sharing multi-level resources, binding operations of pre-xor or post-xor and maximizing parallelism of block cipher algorithms. According to these improvement methods, the type and amount of cryptographic function units are determined. However, the function units are not only numerous but also different in operation data width and latency, so how to organize these function units efficiently becomes a key problem. The stream processor architecture can integrate a large number of function units to obtain high performance. Then, we design and implement a reconfigurable block cipher processor prototype which is based on stream processor architecture, and in order to organize the numerous function units effectively, the function units are divided into basic units, inter-bank-shared units and inter-cluster-shared units respectively according to their processing width. The prototype is synthesized in 65nm CMOS process and several typical block cipher algorithms are mapped on it. The evaluation results show that the processor prototype is area-efficient and its performance is not only beyond that of international application specific instruction cipher processors, but also higher than that of the domestic reconfigurable array processors.
Phase Transition Properties of k-Independent Sets in Random Graphs
Lu Youjun, Xu Daoyun
2017, 54(12):  2843-2850.  doi:10.7544/issn1000-1239.2017.20160694
Asbtract ( 696 )   HTML ( 0)   PDF (1700KB) ( 421 )  
Related Articles | Metrics
Phase transition property is one of the most important properties of the theory of Erds-Rényi random graphs. A subset of vertices is a k-independent set in a simple undirected graph G=(V,E) if the subset is an independent set containing k vertex. In order to understand the structural property of k-independent sets in Erds-Rényi random graphs, the phase transition properties of k-independent sets in Erds-Rényi random graphs are investigated in this paper. It is shown that the threshold probability is p\-c=1-n\+-2/k-1 for the existence of k-independent sets in random graph G(n,p) via the first moment method and the second moment method when 2≤k=ο(n). According to this fact that random graph G(n,p) is equivalent to random graph G(n,m) when m is close to pC\+2\-n, the threshold edge number is given by m\-c=[n(n-1)/2(1-n\+-2/k-1)] for the existence of k-independent sets in random graph G(n,m). The simulation results show that the consistence between simulation and theoretical threshold value for the existence of k-independent sets in random graph G(n,p) and G(n,m) when 2≤k=ο(n), and the threshold value is related to the total number n of vertices and the number k of vertices of independent set. However, when k=ω(n), the theoretical threshold value is not consistent with the simulation threshold value for the existence of k-independent sets in random graph G(n,p) and G(n,m).
DyBGP: A Dynamic-Balanced Algorithm for Graph Partitioning Based on Heuristic Strategies
Li Qi, Zhong Jiang, Li Xue
2017, 54(12):  2851-2857.  doi:10.7544/issn1000-1239.2017.20160690
Asbtract ( 860 )   HTML ( 8)   PDF (1511KB) ( 462 )  
Related Articles | Metrics
With the development of computing technology and the advent of the era of big data, the distributed computing has became a research hotspot. Iterative computation of big graph becomes the focus of the research. Reducing the communication data quantity between subgraph after effective partitioning, it is the key to improve the computational performance, because the existing algorithms are difficult to meet the requirements on both minimizing fraction of egdes cut and load balancing at the same time. In this paper, a dynamic-balanced algorithm for graph partitioning named DyBGP is proposed, and it is used to solve the problem of balanced partition. Based on ensuring the partitioning of subgraph boundary vertices optimal, the perturbation strategy to jump out of local optimum to expand the search space is used. Finally, our algorithm is verified the feasibility in the real-world graph, respectively from the balance coefficient and the scale of edges cut compared with the traditional algorithms, such as Hash, Chunk and Metis. In the number of edges cut, it is decreased about 40%, 30%, 5% with our algorithm under specifying perturbation times. In the balance coefficient, our algorithm is more optimized than Metis. The experimental results show that the algorithm is effective.
Criticality Checkpoint Management Strategy Based on RDD Characteristics in Spark
Ying Changtian, Yu Jiong, Bian Chen, Wang Weiqing, Lu Liang, Qian Yurong
2017, 54(12):  2858-2872.  doi:10.7544/issn1000-1239.2017.20160717
Asbtract ( 836 )   HTML ( 3)   PDF (5567KB) ( 351 )  
Related Articles | Metrics
The default fault tolerance mechanism of Spark is setting the checkpoint by programmer. When facing data loss, Spark recomputes the tasks based on the RDD lineage to recovery the data. Meanwhile, in the circumstance of complicated application with multiple iterations and large amount of input data, the recovery process may cost a lot of computation time. In addition, the recompute task only considers the data locality by default regardless the computing capabilities of nodes, which increases the length of recovery time. To reduce recovery cost, we establish and demonstrate the Spark execution model, the checkpoint model and the RDD critically model. Based on the theory, the criticality checkpoint management (CCM) strategy is proposed, which includes the checkpoint algorithm, the failure recovery algorithm and the cleaning algorithm. The checkpoint algorithm is used to analyze the RDD charactersitics and its influence on the recovery time, and selects valuable RDDs as checkpoints. The failure recovery algorithm is used to choose the appropriate nodes to recompute the lost RDDs, and cleaning algorithm cleans checkpoints when the disk space becomes insufficient. Experimental results show that: the strategy can reduce the recovery overhead efficiently, select valuable RDDs as checkpoints, and increase the efficiency of disk usage on the nodes with sacrificing the execution time slightly.