ISSN 1000-1239 CN 11-1777/TP

Table of Content

01 June 2019, Volume 56 Issue 6
Brain-like Machine: Thought and Architecture
Huang Tiejun, Yu Zhaofei, Liu Yijun
2019, 56(6):  1135-1148.  doi:10.7544/issn1000-1239.2019.20190240
Asbtract ( 1163 )   HTML ( 46)   PDF (3343KB) ( 1124 )  
Related Articles | Metrics
The theoretical limitation of the classical computing machinery, including all the computers with von Neumann architecture, was defined by Alan Turing in 1936. Owing to lack of the hardware neuromorphic devices, neural networks have been implemented with computers to realize artificial intelligence for decades. However, the von Neumann architecture doesn’t match with the asynchronous parallel structure and communication mechanism of the neural networks, with consequences such as huge power consumption. To develop the neural network oriented architecture for artificial intelligence and common information processing is an important direction for architecture research. Brain-like machine is an intelligent machine which is constructed with neuromorphic devices according to the structure of biological neural network, and is better on spatio-temporal information processing than classic computer. The idea of brain-like machine had been proposed before the invention of computer. The research and development practice has been carried out for more than three decades. As one of the several brain-like systems being in operation, SpiNNaker focuses on the research on the architecture of brain-like systems with an effective brain-like scheme. In the next 20 years or so, it is expected that the detailed analysis of model animal brain and human brain will be completed step by step, and the neuromorphic devices and integrated processes will be gradually mature, and the brain-like machine with structure close to the brain and performance far beyond the brain is expected to be realized. As a kind of spiking neural networks, and with neuromorphic devices which behavior is true random, the brain-like machine can emerge abundant nonlinear dynamic behaviors. It had been proven that any Turing machine can be constructed with spiking neural network. Whether the brain-like machine can transcend the theoretical limitation of the Turing machine? This is a big open problem to break through.
3D Memristor Array Based Neural Network Processing in Memory Architecture
Mao Haiyu, Shu Jiwu
2019, 56(6):  1149-1160.  doi:10.7544/issn1000-1239.2019.20190099
Asbtract ( 776 )   HTML ( 4)   PDF (2125KB) ( 603 )  
Related Articles | Metrics
Nowadays, due to the rapid development of artificial intelligence, the memristor-based processing in memory (PIM) architecture for neural network (NN) attracts a lot of researchers’ interests since it performs much better than traditional von Neumann architecture. Equipped with the peripheral circuit to support function units, memristor arrays can process a forward propagation with higher parallelism and much less data movement than that in CPU and GPU. However, the hardware of the memristor-based PIM suffers from the large area overhead of peripheral circuit outside the memristor array and non-trivial under-utilization of function units. This paper proposes a 3D memristor array based PIM architecture for NNs (FMC) by gathering the peripheral circuit of function units into a function pool for sharing among memristor arrays that pile up on the pool. We also propose a data mapping scheme for the 3D memristor array based PIM architecture to further increase the utilization of function units and reduce the data transmission among different cubes. The software-hardware co-design for the 3D memristor array based PIM not only makes the most of function units but also shortens the wire interconnections for better high-performance and energy-efficient data transmission. Experiments show that when training a single neural network, our proposed FMC can achieve up to 43.33 times utilization of the function units and can achieve up to 58.51 times utilization of the function units when training multiple neural networks. At the same time, compared with the 2D-PIM which has the same amount of compute array and storage array, FMC only occupies 42.89% area of 2D-PIM. What’s more, FMC has 1.5 times speedup and 1.7 times energy saving compared with 2D-PIM.
A Secure Encryption Scheme for Deep Learning Accelerators
Zuo Pengfei, Hua Yu, Xie Xinfeng, Hu Xing, Xie Yuan, Feng Dan
2019, 56(6):  1161-1169.  doi:10.7544/issn1000-1239.2019.20190109
Asbtract ( 873 )   HTML ( 23)   PDF (1368KB) ( 580 )  
Related Articles | Metrics
With the rapid development of machine learning techniques, especially deep learning (DL), their application domains are wider and wider and increasingly expanded from cloud computing to edge computing. In deep learning, DL models as the intellectual property (IP) of model providers become important data. We observe that DL accelerators deployed on edge devices for edge computing have the risk of leaking DL models stored on them. Attackers are able to easily obtain the DL model data by snooping the memory bus connecting the on-chip accelerator and off-chip device memory. Therefore, encrypting data transmitted on the memory bus is non-trivial. However, directly using memory encryption in DL accelerators significantly decreases their performance. To address this problem, this paper proposes COSA, a COunter mode Secure deep learning Accelerator architecture. COSA achieves higher security level than direct encryption and removes decryption operations from the critical path of memory accesses by leveraging counter mode encryption. We have implemented COSA in GPGPU-Sim and evaluated it using the neural network workload. Experimental results show COSA improves the performance of the secure accelerator by over 3 times compared with direct encryption and causes only 13% performance decrease compared with an insecure accelerator without using encryption.
Modeling Computational Feature of Multi-Layer Neural Network
Fang Rongqiang, Wang Jing, Yao Zhicheng, Liu Chang, Zhang Weigong
2019, 56(6):  1170-1181.  doi:10.7544/issn1000-1239.2019.20190111
Asbtract ( 711 )   HTML ( 10)   PDF (2584KB) ( 372 )  
Related Articles | Metrics
Deep neural networks (DNNs) have become increasingly popular as machine learning technique in applications, due to their ability to achieve high accuracy for tasks such as speech/image recognition. However, with the rapid growth on the scale of data and precision of recognition, the topology of neural network is becoming more and more complicated. Thus, how to design the energy-efficiency and programmability, neural or deep learning accelerator plays an essential role in next generation computer. In this paper, we propose a layer granularity analysis method, which could extract computation operations and memory requirement features through general expression and basic operation attributions. We also propose a max value replacement schedule strategy, which schedules the computation hardware resource based on the network feature we extract. Evaluation results show our method can increase computational efficiency and lead to a higher resource utilization.
Training and Software Simulation for ReRAM-Based LSTM Neural Network Acceleration
Liu He, Ji Yu, Han Jianhui, Zhang Youhui, Zheng Weimin
2019, 56(6):  1182-1191.  doi:10.7544/issn1000-1239.2019.20190113
Asbtract ( 625 )   HTML ( 4)   PDF (1075KB) ( 608 )  
Related Articles | Metrics
Long short-term memory (LSTM) is mostly used in fields of speech recognition, machine translation, etc., owing to its expertise in processing and predicting events with long intervals and long delays in time series. However, most of existing neural network acceleration chips cannot perform LSTM computation efficiently, as limited by the low memory bandwidth. ReRAM-based crossbars, on the other hand, can process matrix-vector multiplication efficiently due to its characteristic of processing in memory (PIM). However, a software tool of broad architectural exploration and end-to-end evaluation for ReRAM-based LSTM acceleration is still missing. This paper proposes a simulator for ReRAM-based LSTM neural network acceleration and a corresponding training algorithm. Main features (including imperfections) of ReRAM devices and circuits are reflected by the highly configurable tools, and the core computation of simulation can be accelerated by general-purpose graphics processing unit (GPGPU). Moreover, the core component of simulator has been verified by the corresponding circuit simulation of a real chip design. Within this framework, architectural exploration and comprehensive end-to-end evaluation can be achieved.
Accelerating Fully Connected Layers of Sparse Neural Networks with Fine-Grained Dataflow Architectures
Xiang Taoran, Ye Xiaochun, Li Wenming, Feng Yujing, Tan Xu,Zhang Hao, Fan Dongrui
2019, 56(6):  1192-1204.  doi:10.7544/issn1000-1239.2019.20190117
Asbtract ( 819 )   HTML ( 17)   PDF (2435KB) ( 599 )  
Related Articles | Metrics
Deep neural network (DNN) is a hot and state-of-the-art algorithm which is widely used in applications such as face recognition, intelligent monitoring, image recognition and text recognition. Because of its high computational complexity, many efficient hardware accelerators have been proposed to exploit high degree of parallel processing for DNN. However, the fully connected layers in DNN have a large number of weight parameters, which imposes high requirements on the bandwidth of the accelerator. In order to reduce the bandwidth pressure of the accelerator, some DNN compression algorithms are proposed. But accelerators which are implemented on FPGAs and ASICs usually sacrifice generality for higher performance and lower power consumption, making it difficult to accelerate sparse neural networks. Other accelerators, such as GPUs, are general enough, but they lead to higher power consumption. Fine-grained dataflow architectures, which break conventional Von Neumann architectures, show natural advantages in processing DNN-like algorithms with high computational efficiency and low power consumption. At the same time, it remains broadly applicable and adaptable. In this paper, we propose a scheme to accelerate the sparse DNN fully connected layers on a hardware accelerator based on fine-grained dataflow architecture. Compared with the original dense fully connected layers, the scheme reduces the peak bandwidth requirement of 2.44×~ 6.17×. In addition, the utilization of the computational resource of the fine-grained dataflow accelerator running the sparse fully-connected layers far exceeds the implementation by other hardware platforms, which is 43.15%, 34.57%, and 44.24% higher than the CPU, GPU, and mGPU, respectively.
A Propagation Mechanism Combining an Optimal Propagation Path and Incentive in Blockchain Networks
Hai Mo, Zhu Jianming
2019, 56(6):  1205-1218.  doi:10.7544/issn1000-1239.2019.20180419
Asbtract ( 975 )   HTML ( 10)   PDF (1571KB) ( 497 )  
Related Articles | Metrics
For the blockchain fork in blockchain networks can cause the attacker to perform double-spending attack very easily, how to reduce the fork probability is a very important and challenging research issue. Aiming at three problems of current research on optimizing the propagation mechanism of transactions and blocks in blockchain networks to reduce the fork probability: only the propagation delay between adjacent nodes or the total number of routing hops of the propagation process is reduced; the propagation process generates a large number of communication messages; it is based on the assumption that nodes on the propagation path will continue to propagate transactions and blocks, and a propagation mechanism combining an optimal propagation path and incentive (OPPI) in blockchain networks, is proposed to decrease both the total propagation delay and the number of communication messages, which achieves a tradeoff between the propagation efficiency and the propagation cost. Simulation results show that: compared with the existing propagation mechanism of blockchain networks based on Gossip, when the network topology is random graph, scale-free graph, small-world network graph, the number of nodes is 10, 100, 1 000, 10 000 and the degree k is set to 2, 4, 8 respectively, OPPI reduces both the total propagation delay and the number of communication messages generated by the propagation process significantly, specifically, by 99.4% to 99.98% in the total propagation delay and by 99% to 99.1% in the number of communication messages.
Multi-Constrained Energy-Saving Routing Algorithm in Software-Defined Data Center Networks
He Rongxi, Lei Tianying, Lin Ziwei
2019, 56(6):  1219-1230.  doi:10.7544/issn1000-1239.2019.20180029
Asbtract ( 356 )   HTML ( 5)   PDF (3038KB) ( 328 )  
Related Articles | Metrics
The proposed energy-saving routing algorithms for data center networks can be broadly classified into two categories: traffic-awareness and topology-awareness. The performance of the former depends largely on the accuracy of the predicted traffic patterns. However, the traffic patterns in the actual network change dynamically, and the predicted traffic patterns might not accord with the real-time traffic patterns of the network. It is often difficult for the traffic-aware energy-saving routing algorithms to guarantee reliable transmission of burst flows. On the other hand, the latter has to ensure the connectivity of the network topology, which does not consider the traffic load. Therefore, there are often lots of idle devices during low traffic load that result in energy waste. For this reason, for a software-defined data center network (SDCN) with fat-tree topology, we combine the advantages of the two kinds of energy-saving routing algorithms, and propose some concepts including equivalent nodes, the minimum subset for network connectivity, isolated switch and invalid link, auxiliary graph model and SDCN connectivity conditions in this paper. On this basis, a multi-constrained energy-saving routing optimization model is described. Moreover, a multi-constrained energy-saving routing algorithm (MER) is also proposed. In order to reduce the network energy consumption, MER algorithm can sleep as many redundant switches and links as possible while guaranteeing the requirements of the flows in delay and reliability. Finally, the performance of the proposed algorithm is evaluated by using Mininet and Floodlight, and the simulation results show that MER has lower average packet delay and packet loss rate than the existing algorithms in the literature with a desired energy saving rate.
Node Location Verification Framework for WSN
Miao Chunyu, Chen Lina, Wu Jianjun, Zhou Jiaqing, Feng Xuhang
2019, 56(6):  1231-1243.  doi:10.7544/issn1000-1239.2019.20170660
Asbtract ( 472 )   HTML ( 9)   PDF (3765KB) ( 333 )  
Related Articles | Metrics
Localization is one of the pivot technologies in wireless sensor networks. The traditional node localization schemes consider that the locations of anchors are reliable, which makes these schemes are invalid in some scenarios with unreliable anchors such as drifted anchors, fake anchors and malicious anchors. Aiming at solving this problem mentioned above, a distributed and lightweight node location verification framework (NLVF) is proposed. NLVF offers location verification service as an underlying technic for the traditional localization algorithms, including range-based localization algorithm and the range-free localization algorithm. NLVF can filter out these unreliable anchors by which the application area of traditional localization algorithms is enlarged. UNDA (unreliable node detection algorithm) is the key algorithm of NLVF. It constructs location reputation model based on mutual distance observation between neighbors in WSN. UNDA algorithm improves the localization reliability by filtering out these anchors with inferior location reputations. Extensive experiments are conducted to evaluate the performance of UNDA. Results show that NLVF is adapted to both of range-based and range-free localization schemes. It works better in the presence of three kinds of unreliable anchors. So, it yields general applicability. In addition, UNDA relatively has high accuracy, and the average success rate of detection is more than 95%, so NLVF yields significant practicability.
Personalized Privacy Preserving Link Prediction in Social Networks
Meng Xuying, Zhang Qijia, Zhang Hanwen, Zhang Yujun, Zhao Qinglin
2019, 56(6):  1244-1251.  doi:10.7544/issn1000-1239.2019.20180306
Asbtract ( 628 )   HTML ( 6)   PDF (786KB) ( 352 )  
Related Articles | Metrics
Link prediction is widely used to predict and recommend social relationships in social networks. However, it requires users’ personal information, leading to great risks to users’ privacy. To prevent privacy leakage, users may refuse to provide needed information to the service provider, which in turn brings in decreases on the effectiveness of link prediction, and further hurts user experience. To eliminate the concerns of privacy disclosure and encourage users to provide more data for link prediction, we propose personalized privacy preserving link prediction in social network. We get rid of the full dependence on the service provider and friends by making users and the service provider cooperate to complete the process of link prediction. Also, we attach different magnitude noise with personalized privacy settings, maintaining the effectiveness of link prediction while protecting sensitive links and sensitive attributes. Finally, theoretical analysis is provided based on differential privacy, and experimental results on real world datasets show that our proposed methods can provide better privacy protection while maintaining the effectiveness of link prediction.
An Automatic Detection Method for Privacy Leakage Across Application Components
Li Zhen, Tang Zhanyong, Li Zhengqiao, Wang Hai, Gong Xiaoqing, Chen Feng, Chen Xiaojiang, Fang Dingyi
2019, 56(6):  1252-1262.  doi:10.7544/issn1000-1239.2019.20180548
Asbtract ( 550 )   HTML ( 6)   PDF (1171KB) ( 310 )  
Related Articles | Metrics
In recent years, Android operating system has developed rapidly. A large number of mobile users use Android smart devices as tools for personal communication and work. The privacy information of Android mobile users has become one of the main targets of black industry practitioners. Existing privacy detection research mainly focuses on addressing privacy leakage risk within Android applications, including the detection of privacy leakage within program components, the detection of privacy leakage between components, and the detection of ICC vulnerability. Actually, the behavior of sharing users’ privacy through collaboration among application components is widespread, which causes a large number of users’ privacy information to be leaked. How to effectively detect and prevent privacy leakage between application components is an urgent problem. However, the number of components in Android applications is huge and there are plenty of components unrelated to privacy leaks between applications. Therefore, how to detect possible privacy leaks between applications meets a serious challenge. Aiming at this problem, this paper presents a method to construct a component sequence with potential privacy leaks, and the method uses data flow analysis technology to realize a detection system for privacy leakage between application components, named PLDetect. PLDetect solves the problem of incomplete coverage of code and lagging detection results in the existing technology. Finally, based on the privacy leak path, PLDetect utilizes an encryption-based privacy leak protection method to encrypt privacy information, ensuring that information is effectively prevented from being maliciously transmitted without affecting application runtime performance. The final experiment shows that PLDetect detects 5 groups of applications with privacy leaks across application components in 81 applications and effectively blocks privacy data leaks.
Detecting Malicious Domains Using Co-Occurrence Relation Between DNS Query
Peng Chengwei, Yun Xiaochun, Zhang Yongzheng, Li Shuhao
2019, 56(6):  1263-1274.  doi:10.7544/issn1000-1239.2019.20180481
Asbtract ( 799 )   HTML ( 27)   PDF (1474KB) ( 586 )  
Related Articles | Metrics
Malicious domains play a vital role in illicit online activities. Effectively detecting the malicious domains can significantly decrease the damage of evil attacks. In this paper, we propose CoDetector, a novel technique to detect malicious domains based on the co-occurrence relationships of domains in DNS (domain name system) queries. We observe that DNS queries are not isolated, whereas co-occur with each other. We base it design on the intuition that domains that tend to co-occur in DNS traffic are strongly associated and are likely to be in the same property (i.e., malicious or benign). Therefore, we first perform coarse-grained clustering of DNS traffic based on the chronological order of DNS queries. The domains co-occurring with each other will be clustered. Then, we design a mapping function that automatically projects every domain into a low-dimensional feature vector while maintaining their co-occurrence relationships. Domains that co-occur with each others are mapped to similar vectors while domains that not co-occur are mapped to distant vectors. Finally, based on the learned feature representations, we train a classifier over a labeled dataset and further apply it to detect unknown malicious domains. We evaluate CoDetector using real-world DNS traffic collected from an enterprise network over two months. The experimental results show that CoDetector can effectively detect malicious domains (91.64% precision and 96.04% recall).
Security Analysis of Authentication Protocol of WMN Client and LTCA Based on Logic of Events
Xiao Meihua, Li Yanan, Song Jiawen, Wang Xizhong, Li Wei, Zhong Xiaomei
2019, 56(6):  1275-1289.  doi:10.7544/issn1000-1239.2019.20180466
Asbtract ( 312 )   HTML ( 2)   PDF (1140KB) ( 208 )  
Related Articles | Metrics
Wireless mesh network is a new type of broadband wireless network structure, which combines the advantages of wireless local area network and ad-hoc network. The research on wireless mesh network is one of the emerging research focuses about wireless networks. Based on the logic of events, the substitution rule is proposed to ensure the equivalent conversion of user interaction information in the process of property substitution by combining event structures, event classes, axiom clusters and random number lemma. With the basic sequences of authentication protocol between client and LTCA constructed by logic of events, the protocol actions between client and LTCA are formally described, and strong authentication property of the protocol is proved. Under reasonable assumptions, the security property of the authentication protocol between WMN client and LTCA is verified, and the research shows that both the security attributes of wireless network protocols and the authentication property between different principals of cryptographic protocols can be proved by logic of events. By simplifying the formal proof steps with flow chart, the process of logic of events proving protocol’s security property is described, similarly, by comparing and analyzing logic of events with other logical reasoning methods, the universal applicability of logic of events is shown.
Attribute-Based Encryption for Data Security Sharing of Internet of Things
Zhao Zhiyuan, Wang Jianhua, Zhu Zhiqiang, Sun Lei
2019, 56(6):  1290-1301.  doi:10.7544/issn1000-1239.2019.20180288
Asbtract ( 714 )   HTML ( 13)   PDF (2111KB) ( 413 )  
Related Articles | Metrics
The development of Internet of things (IoT) has always been faced with serious security threats and challenges. The security sharing and fine-grained access control of data in the IoT is one of the security issues that urgently need to deal with. In order to solve this problem, an attribute-based encryption (ABE) scheme with the hidden access structure for data security sharing of IoT is proposed. This scheme can achieve fine-grained access control of ciphertext and guarantee data privacy. In this paper, a universal method to convert identity-based encryption (IBE) into ciphertext-policy attribute-based encryption (CP-ABE) is proposed, which supports AND-gate access structure with multiple values. The converted CP-ABE can inherit the characteristics of IBE. Then, the receiver anonymous IBE scheme proposed by Wee is converted to the CP-ABE scheme with the hidden access structure based on the conversion method, which realizes the fixed length of ciphertext, user secret key, public key and master secret key, and only needs one bilinear pairing computation in the decryption phase. The converted scheme is applied to the intelligent medical application scene and the system model and application steps are given. Finally, the results of theoretical analysis and experimental simulation show that the proposed scheme implements the hidden access structure and has advantages in computing efficiency, storage burden and security. It is more efficient and secure when the scheme is applied to the IoT environment.
An Advertising Game Theory Decision-Making Mechanism for Overlapping Seeds in Geo-Social Network
Yu Yaxin, Wang Lei
2019, 56(6):  1302-1311.  doi:10.7544/issn1000-1239.2019.20180068
Asbtract ( 397 )   HTML ( 5)   PDF (2089KB) ( 208 )  
Related Articles | Metrics
Social advertising (or social promotion), one of the most important application for influence maximization (IM), has become a hot industry. The purpose of social advertising is to identify a seed set of k nodes for maximizing the expected influence spread to make businesses promote products by utilizing the cascade effect. However, most existing works concerning influence maximization problem were confined to behaviors that were observed mostly in the virtual world and neglected the location information. In fact, the distance between users can also affect propagation probability during information propagation. Thus, a problem of location-aware influence maximization (LAIM) in geo-social network is formally defined in this paper. Further, a location-aware influence maximization algorithm based on greedy frame is proposed, which introduces the promotion location information into the existing IM definition and solves the problem that influence spread is not in line with actual demand in traditional IM. Due to the overlapping seeds problem caused by inevitable competition, some of the selected nodes may not work well in practice. Thus we first conduct a decision-making game from overlapping seed’s perspective to obtain Nash equilibrium by analyzing the payoff matrix of overlapping seeds and their neighbors so that the overlapping ratio of seed set and influence loss will be reduced. Finally, comprehensive experiments demonstrate the effectiveness of our algorithm and strategy.
A Hierarchical Deep Correlative Fusion Network for Sentiment Classification in Social Media
Cai Guoyong, Lü Guangrui, Xu Zhi
2019, 56(6):  1312-1324.  doi:10.7544/issn1000-1239.2019.20180341
Asbtract ( 529 )   HTML ( 12)   PDF (1956KB) ( 396 )  
Related Articles | Metrics
Most existing research of sentiment analysis are based on either textual or visual data and can not achieve satisfied results. As multi-modal data can provide richer information, multi-modal sentiment analysis is attracting more and more attentions and has become a hot research topic. Due to the strong semantic correlation between visual data and the co-occurrence textual data in social media, mixed data of texts and images provides a new view to learn better classifier for social media sentiment classification. A hierarchical deep correlative fusion network framework is proposed to jointly learn textual and visual sentiment representations from training samples for sentiment classification. In order to alleviate the problem of fine-grained semantic matching between image and text, both the middle level semantic features of images and the deep multi-modal discriminative correlation analysis are applied to learn the most relevant visual feature representation and semantic feature representation, meanwhile, keeping both the visual and semantic feature representations to be linear discriminable. Motivated by the successful use of attention mechanisms, we further propose a multi-modal attention fusion network by incorporating visual and semantic feature representations to train sentiment classifier. Experiments on the real-world datasets which come from social networks show that, the proposed method gets more accurate prediction on multi-media sentiment analysis by capturing the internal relations between text and image hierarchically.
A MD fuzzy k-modes Algorithm for Clustering Categorical Matrix-Object Data
Li Shunyong, Zhang Miaomiao, Cao Fuyuan
2019, 56(6):  1325-1337.  doi:10.7544/issn1000-1239.2019.20180737
Asbtract ( 367 )   HTML ( 3)   PDF (2244KB) ( 275 )  
Related Articles | Metrics
Traditional algorithms generally cluster single-valued attributed data. However, in practice, each attribute of the data object is described by more than one feature vector. For example, customers may purchase multiple products at the same time as they shop. An object described by multiple feature vectors is called a matrix object and such data are called matrix-object data. At present, the research work on clustering algorithms for categorical matrix- object data is relatively rare, and there are still many issues to be settled. In this paper, we propose a new matrix-object data fuzzy k-modes (MD fuzzy k-modes) algorithm that uses the fuzzy k-modes clustering process to cluster categorical matrix-object data. In the proposed algorithm, we introduce the fuzzy factor β with the concept of fuzzy set. The dissimilarity measure between two categorical matrix-objects is redefined, and the heuristic updating algorithm of the cluster centers is provided. Finally, the effectiveness of the MD fuzzy k-modes algorithm is verified on the five real-world data sets, and the relationship between fuzzy factor β and membership w is analyzed. Therefore, in the era of big data, clustering multiple records by using the MD fuzzy k-modes algorithm can make it easier to find customers’ spending habits and preferences, so as to make more targeted recommendation.
Progress and Challenges of Graph Summarization Techniques
Wang Xiong, Dong Yihong, Shi Weijie, Pan Jianfei
2019, 56(6):  1338-1355.  doi:10.7544/issn1000-1239.2019.20180371
Asbtract ( 726 )   HTML ( 11)   PDF (2953KB) ( 547 )  
Related Articles | Metrics
Graph summarization aims to search a group of simple hypergraphs or sparse graphs, which illustrate the main structural information or change trend of the original graph. Based on the application field and background of original graph, different graph summarization techniques are used to construct a specific summary graph, which can solve the problems of information overload, query optimization, spatial compression, impact analysis, social network visualization and so on. According to the classification criteria of the main purpose of the summary, the existing graph summarization techniques are divided into four categories: the graph summarization based on spatial compression, the graph summarization based on query optimization, the graph summarization based on pattern visualization and the graph summarization based on impact analysis. The partial graph summarization algorithms of non-attribute graphs and attribute graphs are tested on real data sets to analyze the indexes of information retention rate, compression rate, information entropy and running time experimentally. At last, not only the development trends of the graph summarization are highlighted, but also the challenges and the future research directions that can be explored in depth are pointed out. Combining with the popular deep learning technology, some valuable and potential Macro coutermeasures are put forward to solve these challenges.