ISSN 1000-1239 CN 11-1777/TP

Table of Contents

01 March 2019, Volume 56 Issue 3
EasiDARM: Distributed Based Adaptive Register Method for Internet of Things
Shi Yahu, Shi Hailong, Cui Li
2019, 56(3):  453-466.  doi:10.7544/issn1000-1239.2019.20170667
With the maturing and growing practicality of the access protocols of the Internet of Things (IoT), connecting devices to cloud platforms that support real-time access has become a mainstream architecture. Because IoT devices and networks are dynamic, devices must register with the platforms periodically to maintain real-time access between devices and platforms. However, existing register methods cost much time and consume substantial resources on devices, the Internet, and platforms; moreover, the platforms can become overloaded when many devices or networks are dynamic. In this paper, we present a distributed adaptive register method (EasiDARM) to reduce the cost in resources and time. By using multiple devices to carry out the complex and time-consuming measurement of the network address translation (NAT) timeout, and by sharing the measurement results among the devices, EasiDARM accelerates both the measurement and the self-adaptation processes. The measurement process is divided into a fast update (FU) stage and a fast converge (FC) stage: EasiDARM assigns and probes candidate intervals with exponential growth during the FU stage to extend the register interval rapidly, and with linear growth during the FC stage to converge the measurement quickly. Experiments show that, compared with the traditional adaptive register method, EasiDARM reduces measurement time by 46%, device consumption by 46%, and Internet and platform consumption by 53% during measurement, and cuts instantaneous platform consumption by 36%.
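The two-stage FU/FC interval search described above can be sketched as follows. This is a minimal illustrative reading of the abstract, not the paper's algorithm: the function name, parameters, and the `probe` callback are assumptions.

```python
def measure_nat_timeout(probe, start=30, fu_factor=2, fc_step=10):
    """Estimate the largest keep-alive interval that survives the NAT timeout.

    `probe(interval)` is assumed to return True if a registration sent after
    `interval` seconds still reaches the platform (the NAT mapping is alive).
    Names and default values are illustrative, not taken from the paper.
    """
    # Fast update (FU) stage: grow the candidate interval exponentially
    # until a probe fails, bracketing the NAT timeout quickly.
    interval = start
    while probe(interval * fu_factor):
        interval *= fu_factor
    # Fast converge (FC) stage: advance linearly inside the bracket
    # to tighten the estimate.
    candidate = interval + fc_step
    while candidate < interval * fu_factor and probe(candidate):
        interval = candidate
        candidate += fc_step
    return interval
```

For a NAT that drops mappings after 300 s, the FU stage brackets the timeout at 240-480 s in a handful of probes, and the FC stage then walks up to 300 s in 10 s steps.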
City-Level IP Geolocation Method Based on Network Node Clustering
Li Mingyue, Luo Xiangyang, Chai Lixiang, Yuan Fuxiang, Gan Yong
2019, 56(3):  467-479.  doi:10.7544/issn1000-1239.2019.20170473
The existing city-level IP geolocation method based on heuristic clustering of the network topology (the HC-based method) clusters IP nodes by simple voting rules, which easily introduces many errors into the geolocation results. This paper presents a city-level IP geolocation method based on network node clustering, referred to as the NNC method. The method first exploits the principle that nodes of the same network community lie in the same metropolitan area network. Since modularity can accurately measure the strength of a network's community structure, the network topology is clustered by modularity optimization, yielding the community partition with the highest modularity. An IP geolocation database voting rule is then used to determine the location of each network community. Finally, the city of a target IP is determined from the network community it belongs to. Experimental results on 15 000 Internet IP nodes in five provinces of China (Henan, Shandong, Shaanxi, Guangdong and Zhejiang) show that, compared with the HC-based method, the proposed method significantly improves the geolocation accuracy and recall rate for target IPs, and reduces the effect of inaccurate landmarks on the location results.
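The community-level voting step can be illustrated with a short sketch. This is one plausible reading of the database voting rule, with hypothetical data structures; the paper's exact rule may differ.

```python
from collections import Counter

def locate_communities(communities, geo_db):
    """Assign each network community the city most of its nodes map to.

    `communities` maps a community id to its member IP nodes; `geo_db` maps an
    IP node to the city recorded in a geolocation database. This majority-vote
    rule is an illustrative reading of the abstract, not the paper's procedure.
    """
    location = {}
    for cid, nodes in communities.items():
        votes = Counter(geo_db[n] for n in nodes if n in geo_db)
        # The community inherits the majority city; every member IP is then
        # geolocated to its community's city.
        location[cid] = votes.most_common(1)[0][0] if votes else None
    return location
```

A community whose members mostly resolve to one city absorbs the minority of inaccurate landmarks, which is why clustering before voting dampens landmark noise.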
Sparse Framework Based Static Taint Analysis Optimization
Wang Lei, He Dongjie, Li Lian, Feng Xiaobing
2019, 56(3):  480-495.  doi:10.7544/issn1000-1239.2019.20180071
Privacy preservation is currently an important research challenge in information system security, and privacy leak detection for applications is an effective means of achieving it. Taint analysis can effectively protect the confidentiality and integrity of information in a system and report an application's privacy leak risks in advance. However, existing static taint analysis tools still incur high analysis overhead, especially in highly sensitive modes. This work first shows, through in-depth analysis, that current IFDS-based taint analysis contains a large amount of unrelated propagation that causes unnecessary overhead; statistics show that it accounts for up to 85.5% of the propagation. To address this problem, this paper applies sparse optimization, an effective optimization technique of recent years, to eliminate unrelated propagation in static taint analysis and thereby reduce time and space costs. We innovatively extend the classic IFDS data-flow analysis framework into a sparse form, provide a corresponding taint analysis algorithm, and implement a tool called FlowDroidSP. Experimental results show that, in non-pruning mode, the tool achieves a 4.8 times speedup and a 61.5% memory reduction compared with the original FlowDroid; in pruning mode, it achieves an average 18.1 times speedup and a 76.1% memory reduction.
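The intuition behind sparse propagation can be shown with a toy worklist pass. This sketch propagates taint only along def-use edges rather than through every program point; it is a didactic simplification, not the paper's IFDS extension, and all names are hypothetical.

```python
def sparse_taint(def_use, sources, sinks):
    """Toy sparse taint propagation: taint flows only along def-use edges,
    so statements unrelated to any tainted value are never visited. This
    mirrors the motivation for sparsifying IFDS, not its actual algorithm.

    `def_use` maps a variable to the variables its value flows into.
    """
    tainted = set(sources)
    worklist = list(sources)
    while worklist:
        v = worklist.pop()
        for u in def_use.get(v, ()):
            if u not in tainted:
                tainted.add(u)
                worklist.append(u)
    # Report only the tainted values that reach a sink.
    return tainted & set(sinks)
```

In a dense analysis, every statement between a source and a sink is visited; here, only the chain `src -> a -> b -> sink` is touched, which is the 85.5% saving the paper quantifies.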
Leveled Fully Homomorphic Encryption Against Adaptive Key Recovery Attacks
Li Zengpeng, Ma Chunguang, Zhao Minghao
2019, 56(3):  496-507.  doi:10.7544/issn1000-1239.2019.20170443
A major open problem is protecting leveled homomorphic encryption from adaptive attacks that allow an adversary to learn the private key. To prevent key recovery attacks on fully homomorphic encryption (FHE), Li Zengpeng et al. (PROVSEC'16) proposed a multiple-secret-key FHE scheme under the learning with errors (LWE) assumption that does not rely on the notion of "valid ciphertexts" of Loftus et al. (SAC'11). However, by exploiting noise information, an attacker can still recover information about the secret key, so their scheme does not provide an effective way to protect it. In this paper, inspired by the work of Li Zengpeng et al. (EPRINT'16), we first give a new key recovery attack on their scheme; we then propose a new FHE scheme with multiple secret keys that differs from EPRINT'16, and prove that it resists key recovery attacks. Our main idea is to adopt the dual version of the encryption algorithm and to generate a "one-time" secret key for every decryption, so that even if an attacker learns some bits of the one-time private key from each decryption query, without the corresponding noise bits the attacker still cannot compute a valid private key.
Graph Degree Histogram Publication Method with Node-Differential Privacy
Zhang Yuxuan, Wei Jianghong, Li Ji, Liu Wenfen, Hu Xuexian
2019, 56(3):  508-520.  doi:10.7544/issn1000-1239.2019.20170886
The widespread use of information systems such as social networks, mail systems, and recommendation systems has produced a large amount of graph data. Publishing and sharing these data under edge or node differential privacy can exploit their potential value while preserving the privacy of the users involved. Compared with edge differential privacy, node differential privacy more effectively prevents users from being re-identified, but it also leads to a higher sensitivity of the query function. To overcome this problem, a novel method named sequence edge-removal (SER) is proposed, and two graph degree-distribution histogram publication mechanisms under node differential privacy are built on it. Experimental results show that the SER method effectively suppresses the sensitivity of the publishing mechanism while retaining more edges of the original graph, and it decreases the error between the published data and the original data. Compared with existing work, under the same level of privacy preservation, the proposed SER-based histogram publishing mechanism describes the degree distribution of the original data more accurately and thus improves the usability of the published data.
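The sensitivity-then-noise pattern behind such mechanisms can be sketched with the standard Laplace mechanism. This is generic differential-privacy machinery, not the paper's SER method: the sketch assumes SER has already bounded each node's degree by `max_degree`, which is what caps the sensitivity.

```python
import math
import random

def laplace_degree_histogram(degrees, max_degree, epsilon, rng=random.Random(0)):
    """Publish a degree histogram under the Laplace mechanism.

    Generic sketch: once every degree is clipped to `max_degree` (the role
    SER plays in the paper), the histogram query's sensitivity is bounded
    in terms of `max_degree`, so Laplace noise of that scale suffices.
    """
    hist = [0] * (max_degree + 1)
    for d in degrees:
        hist[min(d, max_degree)] += 1   # clip degrees to the bound
    scale = max_degree / epsilon        # noise scale = sensitivity / epsilon

    def laplace():
        # Inverse-CDF sampling of the Laplace(0, scale) distribution.
        u = rng.random() - 0.5
        return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

    return [h + laplace() for h in hist]
```

Lowering the degree bound shrinks the noise but discards more edges; SER's contribution is keeping that bound low while retaining as many original edges as possible.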
Publicly Verifiable Database Model with Full Operations Based on Bilinear Map
Wang Qiang, Zhou Fucai, Xuan Pengkai, Wu Qiyu
2019, 56(3):  521-532.  doi:10.7544/issn1000-1239.2019.20170839
Existing verifiable outsourced database schemes support only certain tailored queries, with low efficiency and a large data expansion rate; moreover, their verification and update overheads are unacceptable, so they cannot be applied in practice. To resolve these problems, we propose a novel primitive: a publicly verifiable outsourced database with full operations based on bilinear maps. We present a system model and a security model for the scheme, construct a publicly verifiable outsourced database scheme with full operations based on bilinear maps with each algorithm designed in detail, and give a rigorous security proof under the q-BSDH and VBDHE assumptions. Finally, we compare our scheme with other state-of-the-art schemes in terms of functionality and performance. Theoretical analysis and simulation confirm that our scheme is more functional, efficient, and practical. Furthermore, the verification and update phases do not require the data owner's private key: any client holding the public key and digest can verify the correctness of a query and update the database. Therefore, our scheme supports both public verification and public update.
Secrecy Capacity Optimization Technique for Energy Harvest Cooperative Cognitive Radio Network
Wu Jiaxin, Wu Jigang, Chen Long, Sui Xiufeng
2019, 56(3):  533-543.  doi:10.7544/issn1000-1239.2019.20170838
To maximize the primary secrecy rate, we study the channel resource allocation of the primary user (PU) and the secondary user (SU) in an energy harvesting cognitive radio (EHCR) network built on cooperative communication and energy harvesting technology. In this network, the SU, which can harvest energy from the environment, assists the PU with cooperative communication; as a reward, the SU gains the opportunity to access the PU's spectrum resource once the cooperative communication is over. Given the QoS demand of the SU transmission node, we design a method for computing the maximum primary secrecy rate. Numerical results demonstrate that the secrecy capacity is inversely proportional to the energy harvesting ratio, directly proportional to the cooperative communication ratio, and achievable. Finally, assuming the SU works at a constant low transmit power, when the cooperative communication ratio varies from 0.2 to 0.5, our approach improves the maximum secrecy rate by 79.28%, 80.46%, 64.23%, and 78.85% on average, respectively, compared with the existing strategy; moreover, the proposed algorithm is on average about 7.34 times more efficient than existing algorithms.
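The quantity being maximized has the classical wiretap-channel form, which can be stated in a few lines. This is the textbook secrecy capacity expression, shown for orientation; the paper's actual objective additionally involves the energy-harvesting and cooperation ratios, which this sketch omits.

```python
import math

def secrecy_capacity(snr_main, snr_eve):
    """Classical physical-layer secrecy capacity of a wiretap channel:
    the rate gap between the legitimate link and the eavesdropper link,
    floored at zero (no secrecy is possible when the eavesdropper's
    channel is the stronger one). A background formula, not the paper's
    full EHCR objective.
    """
    return max(0.0, math.log2(1 + snr_main) - math.log2(1 + snr_eve))
```

Cooperative relaying raises `snr_main` relative to `snr_eve`, which is why the secrecy capacity grows with the cooperative communication ratio in the paper's results.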
A Blockchain Based Incentive Mechanism for Crowdsensing Applications
He Yunhua, Li Mengru, Li Hong, Sun Limin, Xiao Ke, Yang Chao
2019, 56(3):  544-554.  doi:10.7544/issn1000-1239.2019.20170670
Crowdsensing applications collect large-scale sensing data from ubiquitous users carrying smart devices. In such applications, the quality of sensing data depends on the participation of highly skilled users, so users should be compensated for the resources they expend on sensing tasks. Existing incentive mechanisms have difficulty meeting the security requirements of the distributed crowdsensing environment. For example, reputation mechanisms may suffer Sybil attacks and whitewashing attacks, which is unfair to honest users; reciprocity mechanisms are inflexible; and monetary schemes, though they can make up for the defects of the other two, either rely on a central authority or lack a provably secure explicit digital currency system, risking system collapse or privacy disclosure caused by the "trusted" center. In this paper, we propose a blockchain-based incentive mechanism built on a distributed architecture that is proved to be secure. In this architecture, participant users act as nodes in a blockchain, and payment transactions are recorded in the blockchain: transactions are verified by a majority of miners and cannot be modified once accepted. The incentive mechanism prevents collusion attacks launched by subsets of participant users and avoids the security threats introduced by a trusted third party. Simulation experiments demonstrate the security strength and feasibility of the proposed incentive mechanism.
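The tamper evidence that makes recorded payments immutable rests on hash chaining, which a toy ledger can illustrate. This shows only the hash-linking property, not the paper's consensus or incentive protocol; all names are illustrative.

```python
import hashlib

def block_hash(prev_hash, payload):
    """Hash-chain a payment record to its predecessor (toy illustration of
    tamper evidence, not a consensus protocol)."""
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

def build_chain(payments):
    """Build a chain of (payload, hash) links starting from a zero hash."""
    chain, prev = [], "0" * 64
    for p in payments:
        h = block_hash(prev, p)
        chain.append((p, h))
        prev = h
    return chain

def verify_chain(chain):
    """Recompute every link; modifying any payment breaks all later hashes."""
    prev = "0" * 64
    for payload, h in chain:
        if block_hash(prev, payload) != h:
            return False
        prev = h
    return True
```

Because each hash commits to the entire prefix, a rewritten payment is detected by any verifier holding the chain, which is what removes the need for a trusted payment center.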
Multi-keyword Ranked Ciphertext Retrieval Scheme Based on Clustering Index
Du Ruizhong, Li Mingyue, Tian Junfeng
2019, 56(3):  555-565.  doi:10.7544/issn1000-1239.2019.20170830
Data owners prefer to outsource documents in encrypted form for privacy preservation, but existing encryption technologies make encrypted data difficult to search, which limits the availability of outsourced data and makes it challenging to design ciphertext search schemes that provide efficient and reliable online information retrieval. To improve the efficiency and precision of ciphertext retrieval, we propose a multi-keyword ciphertext retrieval scheme based on a clustering index. First, an improved Chameleon algorithm is used to cluster the file vectors, during which the file vectors are reduced in dimension by recording keyword positions. Second, a retrieval algorithm suited to the clustering index is proposed, which eliminates, during the query process, large numbers of file vectors irrelevant to the query vector and thereby reduces unnecessary computation. Finally, in the clustering process, the Jaccard similarity coefficient is introduced to measure the similarity between file vectors, and an appropriate threshold is set to improve cluster quality. Theoretical analysis and experimental results show that the scheme effectively improves the efficiency and precision of ciphertext retrieval while guaranteeing the privacy and security of the data.
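The cluster-pruning idea can be sketched over plaintext keyword sets. This is a simplified illustration of Jaccard-based pruning and ranking; the actual scheme operates on encrypted vectors, and the data layout here is an assumption.

```python
def jaccard(a, b):
    """Jaccard similarity coefficient between two keyword sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def search_clusters(clusters, query, threshold=0.2):
    """Prune whole clusters whose centroid keyword set is too dissimilar to
    the query, then rank the surviving files by similarity. A plaintext
    simplification of the scheme: `clusters` is a list of
    (centroid keyword set, [(file name, file keyword set), ...]) pairs.
    """
    results = []
    for centroid, files in clusters:
        if jaccard(centroid, query) < threshold:
            continue  # skip the whole cluster, saving per-file comparisons
        for name, kw in files:
            results.append((jaccard(kw, query), name))
    return [name for _, name in sorted(results, reverse=True)]
```

Skipping a cluster avoids comparing the query against every file inside it, which is where the retrieval-time savings come from.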
A Multi-Level Intrusion Detection Method Based on KNN Outlier Detection and Random Forests
Ren Jiadong, Liu Xinqian, Wang Qian, He Haitao, Zhao Xiaolin
2019, 56(3):  566-575.  doi:10.7544/issn1000-1239.2019.20180063
Intrusion detection systems can efficiently detect attack behaviors that do great damage to network security. However, many current intrusion detection systems have low detection rates for the abnormal behaviors Probe (probing), U2R (user to root), and R2L (remote to local). Targeting this weakness, a new hybrid multi-level intrusion detection method is proposed to classify network data as normal or abnormal behavior. The method, called KNN-RF, combines the KNN (K-nearest neighbors) outlier detection algorithm with a multi-level random forests (RF) model. First, KNN outlier detection is applied to detect and delete outliers in each category, yielding a small high-quality training dataset. Then, based on the similarity of network traffic, a new division of data categories is put forward; this division avoids mutual interference between anomalous behaviors during detection, especially for attack behaviors with small traffic. On this division, a multi-level random forests model is constructed to detect abnormal network behaviors and improve the efficiency of detecting known and unknown attacks. The popular KDD (knowledge discovery and data mining) Cup 1999 dataset is used to evaluate the proposed method. Compared with other algorithms, the proposed method is significantly superior in accuracy and detection rate, and detects Probe, U2R, and R2L effectively.
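The KNN outlier filtering step can be sketched with the usual k-th-neighbor distance score. This is a generic sketch under the assumption that outliers are points far from their k-th nearest neighbor; the paper applies such filtering per category on KDD Cup 1999 features, and the parameter names here are illustrative.

```python
def knn_outlier_filter(points, k=2, keep_ratio=0.8):
    """Score each point by the distance to its k-th nearest neighbor and
    keep the fraction with the smallest scores, discarding likely outliers.
    Generic sketch of a KNN outlier step, not the paper's exact procedure.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    scores = []
    for i, p in enumerate(points):
        ds = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append((ds[k - 1], i))   # distance to the k-th neighbor
    scores.sort()
    keep = int(len(points) * keep_ratio)
    return [points[i] for _, i in scores[:keep]]
```

Filtering each training category this way yields the small, high-quality training set the multi-level random forests are then built on.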
Trajectory Privacy Protection Method Based on Multi-Anonymizer
Zhang Shaobo, Wang Guojun, Liu Qin, Liu Jianxun
2019, 56(3):  576-584.  doi:10.7544/issn1000-1239.2019.20180033
At present, trajectory privacy protection in continuous location-based services has attracted wide attention. Existing privacy-preserving methods mainly adopt a centralized structure based on a trusted third party, which carries both privacy risks and performance bottlenecks. To overcome these defects, a trajectory privacy-preserving method based on multiple anonymizers (TPMA) is proposed, deploying multiple anonymizers between the user and the location service provider. In each query, the user first selects a pseudonym, and the user's query content is divided into n shares by the Shamir threshold scheme; the shares are then sent to n different, randomly selected anonymizers for processing, one of which is responsible for the user's K-anonymity. Under this method, an attacker cannot obtain the user's trajectory or query content from a single anonymizer, so the anonymizers need only be semi-trusted. The method enhances the privacy of the user's trajectory and effectively avoids the single point of failure and performance bottleneck of the single-anonymizer structure. Security analysis shows that our approach effectively protects the user's trajectory privacy, and experiments show that it reduces the computation and communication overhead of a single anonymizer compared with the trusted-third-party model.
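The Shamir threshold splitting that the query content undergoes can be sketched directly. This is the standard (t, n) Shamir scheme over a prime field; the field size and encoding of query content into an integer are assumptions of the sketch, not specifics from the paper.

```python
import random

PRIME = 2 ** 61 - 1  # a Mersenne prime, large enough for toy payloads

def shamir_split(secret, n, t, rng=random.Random(42)):
    """Split `secret` into n shares such that any t reconstruct it and any
    t-1 reveal nothing: evaluate a random degree-(t-1) polynomial with
    constant term `secret` at x = 1..n over GF(PRIME)."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]

    def poly(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME

    return [(x, poly(x)) for x in range(1, n + 1)]

def shamir_reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret
```

Since any t-1 shares are information-theoretically independent of the secret, a single anonymizer holding one share learns nothing about the query content, which is exactly the semi-trust property TPMA relies on.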
Correction Vector Based Distributed DV-Hop Localization Refinement Algorithm
Lin Weiwei, Yao Yingbiao, Zou Ke, Feng Wei, Yan Junrong
2019, 56(3):  585-593.  doi:10.7544/issn1000-1239.2019.20170841
Node localization is one of the hot topics in wireless sensor networks (WSNs). The DV-Hop (distance vector hop) localization algorithm, based on hop-distance estimation, is a typical range-free localization algorithm: its advantage is simplicity and ease of implementation, and its disadvantage is the low positioning accuracy resulting from the hop-distance ambiguity problem. Addressing this problem in the traditional DV-Hop localization algorithm, this paper proposes a correction vector based distributed localization refinement algorithm (CVLR). First, starting from the DV-Hop localization results, CVLR constructs a position correction vector using the pseudo ranging distance and the positioning distance between neighbors and unknown nodes. Second, the refinement is modeled as minimizing the sum of squared differences between the two distances in the direction of the correction vector. Finally, a simple iterative search method is proposed to solve the minimization problem. In practice, CVLR comprises CVLR1, which makes full use of 1-hop neighbor information, and CVLR2, which makes full use of 1-hop and 2-hop neighbor information. Simulation results show that, compared with DV-Hop, DV-RND (an improved DV-Hop localization algorithm based on regulated neighborhood distance), and DV-EA (an improved DV-Hop localization algorithm based on an evolutionary algorithm), CVLR1 improves positioning accuracy by about 30%, 25%, and 20% on average, and CVLR2 by about 45%, 42%, and 40%.
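The baseline that CVLR refines, classic DV-Hop distance estimation, can be sketched in a few lines. This is the standard DV-Hop estimation step (hop counts times an average per-hop distance), not CVLR itself; the graph representation is an assumption of the sketch.

```python
from collections import deque

def hop_counts(adj, source):
    """BFS hop counts from `source` over an adjacency dict."""
    hops = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                q.append(v)
    return hops

def dv_hop_distance(adj, anchors, unknown):
    """Classic DV-Hop distance estimate: each anchor derives an average
    per-hop distance from the other anchors, and the unknown node's
    distance to that anchor is its hop count times that average.
    `anchors` maps anchor id to its (x, y) position.
    """
    est = {}
    for a, (ax, ay) in anchors.items():
        hops = hop_counts(adj, a)
        total_d = total_h = 0.0
        for b, (bx, by) in anchors.items():
            if b != a:
                total_d += ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
                total_h += hops[b]
        est[a] = hops[unknown] * (total_d / total_h)
    return est
```

The hop-distance ambiguity is visible here: two nodes one hop apart may be near or far within radio range, and this averaged estimate cannot tell them apart; CVLR's correction vector is what pulls the resulting position estimates back toward consistency.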
Time Series Shapelets Extraction via Similarity Join
Zhang Zhenguo, Wang Chao, Wen Yanlong, Yuan Xiaojie
2019, 56(3):  594-610.  doi:10.7544/issn1000-1239.2019.20170741
For time series classification, classifiers built from shapelets achieve high classification accuracy while producing easily interpretable results; the extraction of discriminative shapelets has therefore attracted much attention in time series data mining. Research on shapelet extraction has made promising progress, but problems remain, chiefly that traversing all time series subsequences to find discriminative shapelets is extraordinarily time-consuming; pruning techniques can accelerate the extraction process, but they usually reduce classification accuracy. In this paper, we propose a novel shapelet extraction method based on similarity join, which abandons the idea of computing each subsequence's discriminative power separately. In the proposed method, each subsequence is treated as a basic computing unit, and the similarity vector of two time series is obtained by a similarity join over their subsequences. For time series with different class labels, we compute the difference vector of each time series pair and merge these vectors into a candidate matrix representing the differences between classes, from which eligible shapelets are easily obtained. Extensive experiments on real time series datasets show that, compared with existing shapelet extraction methods, the proposed method achieves high time efficiency while ensuring excellent classification accuracy.
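The primitive underlying all shapelet methods, the distance from a candidate subsequence to a time series, can be stated compactly. This is the standard shapelet distance; the paper's contribution is evaluating it in bulk via a similarity join rather than one subsequence at a time.

```python
def subsequence_distance(shapelet, series):
    """Distance from a candidate shapelet to a time series: the minimum
    Euclidean distance over all sliding windows of the shapelet's length.
    This is the standard shapelet primitive that similarity-join-based
    extraction evaluates for many subsequences at once.
    """
    m = len(shapelet)
    best = float("inf")
    for i in range(len(series) - m + 1):
        d = sum((shapelet[j] - series[i + j]) ** 2 for j in range(m)) ** 0.5
        best = min(best, d)
    return best
```

A shapelet is discriminative when this distance is small for one class and large for the other; the candidate matrix in the paper captures exactly those per-pair differences.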
Mental Stress Assessment Approach Based on Smartphone Sensing Data
Wang Feng, Wang Yasha, Wang Jiangtao, Xiong Haoyi, Zhao Junfeng, Zhang Daqing
2019, 56(3):  611-622.  doi:10.7544/issn1000-1239.2019.20170809
Mental stress is harmful to individuals' physical and mental well-being. It is easily overlooked in its early stage, leading to serious problems later, so it is crucial to detect stress before it evolves into a severe condition. Traditional stress detection methods are based on either questionnaires or professional devices, which are time-consuming, costly, and intrusive. With the popularity of smartphones whose embedded sensors can capture users' context, including movement, sound, and location, sensing users' behavior via smartphones offers a less intrusive alternative. This paper proposes an automatic, non-intrusive stress detection approach based on mobile sensing data captured by smartphones; by extracting well-chosen features from the sensed data, a more efficient psychological stress assessment method is obtained. First, we generate a large set of features representing users' behavior, explore the correlation between mobile sensing data and stress, and identify the discriminative features. Second, we develop a semi-supervised stress detection model, using techniques such as co-training and random forests to cope with insufficient labeled data. Finally, we evaluate the model on the StudentLife dataset, and the experimental results verify the advantages of our approach over the other baselines.
A Link Prediction Model Based on Hierarchical Information Granular Representation for Attributed Graphs
Luo Sheng, Miao Duoqian, Zhang Zhifei, Zhang Yuanjian, Hu Shengdan
2019, 56(3):  623-634.  doi:10.7544/issn1000-1239.2019.20170961
With the accumulation of network graph data carrying node attributes, the relations between node attributes and node linkages become ever more complex, posing many challenges for link prediction in complex networks. The main difficulty is the inconsistency between the different data sources, that is, between the latent linkages implied by node attributes and the observed linkages in the network topology; this inconsistency directly affects the correctness and accuracy of link prediction. To handle this multi-source inconsistency and fuse the heterogeneous data, following the ideas of granular computing and multi-layer granular data representation, we model the original data at different levels of granular representation and ultimately eliminate the inherent inconsistencies by finding the optimal granular structure. In this paper, we first define the granular representation of the data and the relations between granules at different levels; then we construct a log-likelihood model of the data, regularized by constraints derived from the granular relations; finally, we use the trained model to predict the link probability between nodes. Experiments show that granular representation reduces the inconsistency of multi-source data, and the statistical model regularized by the granular relations outperforms state-of-the-art methods, effectively improving the accuracy of link prediction in attributed graphs.
Multi-Model Data Fusion Based Unobtrusive Identification Method
Yu Diancun, Chen Yiqiang, Peng Xiaohui, Jiao Shuai, Li Xiaohai, Zhong Xi
2019, 56(3):  635-642.  doi:10.7544/issn1000-1239.2019.20170807
Traditional gait recognition for risk control on intelligent terminal devices still has problems. Existing schemes use the accelerometer, gyroscope, and other sensors to identify and verify users by gait, but they impose many restrictions that hinder the use and promotion of the technology: the sensor device must be fixed at a particular position such as the ankle, knee, or waist; the device must have a designated orientation; or the user must perform a specific action. In addition, applying gait-based identification and verification to risk control requires a complete and reliable system architecture, and existing architectures still have major shortcomings. This paper therefore presents a non-intrusive, position-independent identification and verification method that uses only the accelerometer, and builds a complete system implementation architecture around it, improving overall accuracy and usability. The method first predicts the user's behavior and the position of the device, then performs gait analysis and identification. In our experiments, we collect data using only the accelerometer built into a smartphone and perform position-independent gait analysis to identify the user, thereby reducing the risk of smartphone use and improving the safety factor. The experimental results show that the designed architecture improves overall system accuracy; the method achieves a high recognition rate with a very low FPR (false positive rate), and improves the security of apps and smartphones without interfering with the user.
An Adaptive Algorithm in Multi-Armed Bandit Problem
Zhang Xiaofang, Zhou Qian, Liang Bin, Xu Jin
2019, 56(3):  643-654.  doi:10.7544/issn1000-1239.2019.20180019
As an important active field in machine learning, reinforcement learning has received extensive attention in recent years. The multi-armed bandit (MAB) problem is a typical instance of the exploration-exploitation dilemma in reinforcement learning, and the stochastic multi-armed bandit (SMAB) problem, a classical MAB problem, is the basis of many newer MAB problems. To address the insufficient use of information and the poor generalization ability of existing MAB methods, this paper presents an adaptive SMAB algorithm that balances exploration and exploitation based on the chosen number of the arm with the minimal estimate, named CNAME for short. CNAME uses an arm's chosen count and its estimated value at the same time, choosing an arm according to an exploration probability that is updated adaptively. To control the decay rate of the exploration probability, a parameter w is introduced to adjust the influence of feedback during the selection process. Furthermore, CNAME does not depend on contextual information and hence has better generalization ability. We theoretically prove and analyze an upper bound on CNAME's regret. Experimental results in different scenarios show that CNAME achieves greater reward and smaller regret with higher efficiency than commonly used methods, and that its generalization ability is very strong.
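An adaptive-exploration bandit in this spirit can be sketched as follows. The decay rule here (exploration probability shrinking with the count of the least-pulled arm, tuned by w) is an illustrative stand-in for CNAME's exact update, which the abstract does not fully specify.

```python
import random

def adaptive_bandit(rewards, horizon, w=0.5, rng=random.Random(7)):
    """Epsilon-greedy bandit whose exploration probability decays with the
    chosen count of the least-pulled arm, loosely in the spirit of CNAME
    (the paper's exact update rule differs; w tunes the decay here).
    `rewards[a]()` is assumed to draw a reward for arm a.
    """
    n = len(rewards)
    counts, values = [0] * n, [0.0] * n
    total = 0.0
    for _ in range(horizon):
        eps = w / (1 + min(counts))   # explore less once every arm is sampled
        if rng.random() < eps:
            a = rng.randrange(n)                        # explore
        else:
            a = max(range(n), key=lambda i: values[i])  # exploit
        r = rewards[a]()
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]        # incremental mean
        total += r
    return total, counts
```

Tying the exploration probability to chosen counts rather than to raw time means the algorithm keeps exploring while any arm is under-sampled, then commits quickly, which is the information-use idea the abstract highlights.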
Spanner Algorithm for Directed Graph Stream
Zhang Xin, Li Xiaoguang
2019, 56(3):  655-665.  doi:10.7544/issn1000-1239.2019.20170680
With the rapidly growing demand for analysis in fields such as social networks, transportation networks, and bioinformatics networks, processing large-scale graph data has become a new challenge for information technology. A graph spanner is a subgraph in which the distance between every pair of vertices falls within a range defined by the stretch factor; with a spanner, the storage and computational cost of large-scale graph data processing are greatly reduced. Most existing work focuses on spanners of undirected graphs; we propose an effective algorithm for directed graph spanners. First, we redefine the concept of the cluster set and of three kinds of crucial edges (cluster edges, bridge edges, and free edges), and theoretically analyze the correctness of constructing a (3,2)-spanner from the crucial edges. We then propose a spanner algorithm for graph data in the streaming model, in which edges are clustered and updated according to the types of their endpoints, yielding a (3,2)-spanner of the full graph. Finally, we give a theoretical analysis of the space complexity O(n²/4) and present experiments on synthetic power-law graphs. The experiment results show that our algorithm meets the definition of a (3,2)-spanner and has low time and space overhead.
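The (3,2)-spanner property the experiments check can be verified directly from the definition. This sketch is the verification, not the construction algorithm; unweighted hop distances and the adjacency-dict representation are assumptions.

```python
from collections import deque

def distances(adj, source):
    """BFS shortest hop distances in a directed graph."""
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def is_spanner(graph, subgraph, alpha=3, beta=2):
    """Check the (alpha, beta)-spanner definition: for every pair reachable
    in the full graph, d_sub(u, v) <= alpha * d_graph(u, v) + beta."""
    for u in graph:
        dg = distances(graph, u)
        ds = distances(subgraph, u)
        for v, d in dg.items():
            if ds.get(v, float("inf")) > alpha * d + beta:
                return False
    return True
```

Note that in the directed setting reachability itself must be preserved: dropping the only path from u to v makes d_sub infinite and fails the check, which is why the choice of crucial edges matters.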
A Spatio-Temporal Index Based on Skew Spatial Coding and R-Tree
Zhao Xinyi, Huang Xiangdong, Qiao Jialin, Kang Rong, Li Na, Wang Jianmin
2019, 56(3):  666-676.  doi:10.7544/issn1000-1239.2019.20170750
With the development of the mobile Internet and the Internet of Things, more and more mobile devices such as vehicles and mobile phones embed GPS services. As they move, these devices generate large amounts of spatio-temporal data with multi-dimensional attributes such as time, longitude, and latitude. Traditional spatio-temporal indexes can neither handle such large volumes of data, nor handle data with both temporal and spatial dimensions, nor support many kinds of spatio-temporal queries. This paper presents GRIST (Geohash and R-Tree based index for spatio-temporal data), an innovative two-level spatio-temporal index based on Geohash and the R-Tree. The first level is a spatial index based on grid partitioning, which divides space into non-overlapping subspaces of different sizes, with each grid cell encoded by Geohash. The second level is a time index based on the R-Tree, where each spatial grid cell corresponds to a one-dimensional R-Tree index. GRIST supports two types of queries: time-based queries and spatio-temporal queries, the latter further subdivided into range queries and trajectory queries. In addition, we implement GRIST on the NoSQL key-value database Cassandra and use Spark for distributed querying. Compared with PostGIS and GeoMesa, GRIST achieves better load and query performance: 10~45 times faster loading and 2~4 times faster querying.
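The Geohash encoding used for the first-level grid identifiers is standard and can be sketched in full. This is the textbook Geohash algorithm (interleaved longitude/latitude bisection mapped to base-32), shown because it determines the prefix-based grid hierarchy GRIST builds on; how GRIST chooses cell sizes is not part of this sketch.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lon, precision=6):
    """Standard Geohash encoding: alternately bisect the longitude and
    latitude ranges, emit a 1 bit when the point is in the upper half,
    and map each 5-bit group to a base-32 character. Longer hashes name
    smaller cells nested inside the cells named by their prefixes.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits, ch, even, code = 0, 0, True, []
    while len(code) < precision:
        if even:   # longitude bit
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                ch, lon_lo = ch * 2 + 1, mid
            else:
                ch, lon_hi = ch * 2, mid
        else:      # latitude bit
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                ch, lat_lo = ch * 2 + 1, mid
            else:
                ch, lat_hi = ch * 2, mid
        even = not even
        bits += 1
        if bits == 5:
            code.append(BASE32[ch])
            bits, ch = 0, 0
    return "".join(code)
```

Because nearby points share long hash prefixes, a spatial range query maps to a small set of key prefixes, which is what makes Geohash cells a natural partition key in a key-value store like Cassandra.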