
2022  Vol. 59  No. 7

Abstract:
Quantization can compress convolutional neural network (CNN) models and improve computing efficiency. However, existing accelerator designs for quantized CNNs usually face challenges such as the diversity of quantization algorithms, poor reusability of code modules, low efficiency of data exchange, and insufficient utilization of resources. To meet these challenges, we propose a flexible acceleration framework for quantized CNNs named FAQ-CNN, which optimizes accelerator design from the three aspects of computing, communication and storage. FAQ-CNN supports rapid deployment of quantized CNN models in the form of software tools. Firstly, a component for quantization algorithms is designed to separate the calculation part of the quantization algorithm from its value projection; optimization techniques such as operator fusion, double buffering and pipelining are also used to improve the parallel execution efficiency of the CNN inference task. Then, a hierarchical, bitwidth-independent encoding scheme and a parallel decoding method are proposed to efficiently support batch transmission and parallel computing of low-bitwidth data. Finally, a resource allocation optimization model, which can be transformed into an integer nonlinear programming problem, is established for FAQ-CNN; a heuristic pruning strategy is used to reduce the size of the design space. Extensive experimental results show that FAQ-CNN can support almost all kinds of quantized CNN accelerators efficiently and flexibly. When activations and weights are set to 16 bits, the computing performance of the FAQ-CNN accelerator is 1.4 times that of Caffeine. With an 8-bit configuration, FAQ-CNN achieves a superior performance of 1.23 TOPS.
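
To make the separation of value projection from integer computation concrete, here is a minimal NumPy sketch of symmetric uniform quantization followed by an integer-domain matrix multiply. The function names and scaling rule are illustrative assumptions, not FAQ-CNN's actual interface.

```python
import numpy as np

def quantize(x, n_bits):
    """Value projection step: map real values onto signed n-bit integers."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax + 1e-12
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

def quantized_matmul(a, w, n_bits=8):
    """Integer-domain computation kept separate from projection, then rescaled."""
    qa, sa = quantize(a, n_bits)
    qw, sw = quantize(w, n_bits)
    acc = qa @ qw              # accelerator-friendly integer accumulation
    return acc * (sa * sw)     # dequantize back to real values

a = np.random.randn(4, 16).astype(np.float32)
w = np.random.randn(16, 8).astype(np.float32)
print(np.max(np.abs(quantized_matmul(a, w) - a @ w)))   # small quantization error
```
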
Abstract:
Field programmable gate arrays (FPGAs) are extremely susceptible to failures caused by high-energy particle radiation in space, which affects the normal execution of on-chip tasks. At present, the triple modular redundancy (TMR) method is usually used for fault-tolerant design. Although a good fault-tolerance effect can be achieved, a large amount of resource overhead is required. Especially when the radiation level is low, applying TMR to all tasks aggravates this problem of high resource overhead. In view of this, an FPGA fault-tolerance method based on dynamic self-adaptive redundancy is proposed. First of all, exploiting the high sensitivity of on-chip block RAM (BRAM) to space particle radiation, a BRAM-based radiation level monitor is designed and improved to periodically monitor the radiation level of the space environment. Secondly, the slack time of the execution cycle and the current radiation level are used as the criteria for evaluating the reliability requirements of tasks, and redundancy strategies are then matched dynamically and adaptively at task granularity under different radiation levels, ensuring the successful execution of on-chip tasks while avoiding high resource overhead. Simulation results show that an FPGA using this method has high reliability under different radiation levels. Compared with the popular redundancy-based FPGA fault-tolerance method, the on-chip task completion rate is increased by 57.2% on average under the same radiation level.
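
As a rough illustration of matching a redundancy strategy to the monitored radiation level and the task's slack, the following Python sketch uses an assumed decision rule; the paper's actual evaluation criteria and redundancy modes are not reproduced here.

```python
# Assumed policy sketch: pick a redundancy mode per task from the measured
# radiation level and the task's slack time relative to its execution time.
def choose_redundancy(radiation_level, slack, exec_time):
    """Return 'simplex', 'DMR', or 'TMR' for one task (illustrative rule)."""
    if radiation_level == "low":
        return "simplex"                      # save resources when upsets are unlikely
    if radiation_level == "medium":
        return "DMR" if slack >= exec_time else "simplex"
    # high radiation: triplicate only if the schedule can absorb the overhead
    return "TMR" if slack >= 2 * exec_time else "DMR"

tasks = [{"name": "t1", "slack": 8, "exec": 3}, {"name": "t2", "slack": 1, "exec": 3}]
for t in tasks:
    print(t["name"], choose_redundancy("high", t["slack"], t["exec"]))
```
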
Abstract:
Federated learning deploys deep learning training tasks on mobile edge networks. Mobile devices participating in learning only need to send their trained local models to the server instead of sending personal data, thereby protecting users' data privacy. To speed up the adoption of federated learning, efficiency optimization is the key. The main factors affecting efficiency include the communication cost between devices and the server, the model convergence rate, and the security and privacy risks of mobile edge networks. Based on a thorough investigation of existing optimization methods, we summarize the efficiency optimization of federated learning, for the first time, into communication optimization, training optimization, and protection mechanisms. Specifically, we discuss the optimization of federated learning communication from the two aspects of edge computing coordination and model compression, which can reduce communication frequency and resource consumption. Then, because mobile edge networks contain many heterogeneous factors, such as mobile devices with different computing resources and data of different quality, we review the optimization of the federated learning process from the four elements of device selection, resource coordination, model aggregation control, and data optimization. Furthermore, the security and privacy protection mechanisms of federated learning are expounded. After comparing the innovations and contributions of related technologies, the advantages and disadvantages of existing solutions are summarized and the new challenges faced by federated learning are discussed. Finally, we propose edge-intelligent federated learning based on the idea of edge computing, and provide innovative methods and future research directions in data optimization, adaptive learning, incentive mechanisms, and advanced technologies.
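
For reference, this is a minimal NumPy sketch of the standard FedAvg-style weighted aggregation that the surveyed model-aggregation-control techniques build on; it illustrates the baseline only, not any specific method from the survey.

```python
import numpy as np

def fedavg(client_models, client_sizes):
    """Weighted average of client model parameters by local dataset size."""
    total = sum(client_sizes)
    keys = client_models[0].keys()
    return {k: sum(m[k] * (n / total) for m, n in zip(client_models, client_sizes))
            for k in keys}

# three clients with toy parameters and local dataset sizes 10, 30, 60
clients = [{"w": np.ones(3) * i, "b": np.array([i])} for i in (1.0, 2.0, 3.0)]
print(fedavg(clients, [10, 30, 60]))
```
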
Abstract:
Graph neural networks can effectively learn network semantic information and have achieved good performance on node classification tasks, but they still face a challenge: how to make full use of rich heterogeneous semantic information and comprehensive structural information to make node classification more accurate. To resolve this challenge, HNNCF (heterogeneous network node classification framework), based on the graph convolution operation, is proposed to solve the node classification task in heterogeneous networks; it consists of two steps, heterogeneous network reduction and graph convolution node classification. Firstly, through the designed heterogeneous network reduction rules, HNNCF simplifies a heterogeneous network into a semantic homogeneous network and retains the semantic information of the heterogeneous network through relation representations between nodes, reducing the complexity of network structure modeling. Then, based on the message passing framework, a graph convolution node classification method is designed to learn network structure information on the semantic homogeneous network, such as neighbor weights free of the sum-to-one constraint, so as to capture the differences among relations and extract neighbor semantics. Finally, heterogeneous node representations are generated and used to classify nodes and identify their category labels. Experiments on three public node classification datasets show that HNNCF can make full use of heterogeneous semantic information and effectively learn network structure information, such as reasonable neighbor weights, to improve the performance of heterogeneous network node classification.
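
A minimal NumPy sketch of the message-passing idea with relation-aware neighbor weights that are not forced to sum to one; the weight computation and update rule here are assumptions for illustration, not HNNCF's exact formulation.

```python
import numpy as np

def message_passing_layer(h, edges, rel_emb, w_msg):
    """One graph-convolution step: aggregate neighbor messages with scalar
    weights that depend on the node pair and the relation representation,
    without normalizing the weights to sum to 1 over neighbors."""
    out = np.zeros_like(h)
    for dst, src, rel in edges:
        weight = np.dot(h[dst], h[src]) + np.dot(h[dst], rel_emb[rel])
        out[dst] += weight * (h[src] @ w_msg)     # weighted neighbor message
    return np.tanh(out + h)                       # residual connection + nonlinearity

h = np.random.randn(4, 8)                         # 4 nodes, 8-dim representations
rel_emb = np.random.randn(2, 8)                   # 2 relation representations
w_msg = np.random.randn(8, 8)
edges = [(0, 1, 0), (0, 2, 1), (3, 1, 0)]         # (destination, source, relation)
print(message_passing_layer(h, edges, rel_emb, w_msg).shape)
```
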
Abstract:
Under the few-shot condition, because of the low-data problem, that is, labeled data are rare and difficult to collect, it is very difficult to train a good classifier with traditional deep learning. In recent research, methods based on measuring low-level local information and the transductive propagation network (TPN) have achieved good classification results. Local information can measure the relation between features well, but the low-data problem still exists. To address this issue, MSLPN (multi-scale label propagation network), based on TPN, is proposed in this paper. The core idea of the method is to use a multi-scale generator to generate image features at multiple scales, obtain the similarity scores of samples under different scale features through a relation measurement module, and obtain classification results by integrating the similarity scores at different scales. Specifically, the method first generates multiple image features of different scales through a multi-scale generator. Then, the similarity scores of the multi-scale information are used for label propagation. Finally, classification results are obtained from the multi-scale label propagation results. Compared with TPN, on miniImageNet the classification accuracy under the 5-way 1-shot and 5-way 5-shot settings is increased by 2.77% and 4.02% respectively, while on tieredImageNet it is increased by 1.16% and 1.27% respectively. The experimental results show that the proposed method can effectively improve classification accuracy by using multi-scale feature information.
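
The following NumPy sketch shows the closed-form label propagation used in TPN-style methods, applied per feature scale and then fused by averaging; the fusion rule and graph construction are assumptions for illustration, not MSLPN's exact design.

```python
import numpy as np

def label_propagation(S, y_onehot, alpha=0.99):
    """Closed-form label propagation on one similarity graph:
    F = (I - alpha * S_norm)^(-1) Y with symmetric normalization."""
    d = S.sum(axis=1)
    S_norm = S / (np.sqrt(np.outer(d, d)) + 1e-12)
    n = S.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S_norm, y_onehot)

def multi_scale_propagation(similarity_per_scale, y_onehot):
    """Propagate labels on each scale's graph, then fuse the per-scale scores."""
    scores = [label_propagation(S, y_onehot) for S in similarity_per_scale]
    return sum(scores) / len(scores)

# toy example: 6 samples, 2 classes, 2 labeled support samples, 3 feature scales
scales = [np.abs(np.random.randn(6, 6)) for _ in range(3)]
scales = [(S + S.T) / 2 for S in scales]          # symmetric similarity matrices
y = np.zeros((6, 2)); y[0, 0] = 1; y[1, 1] = 1
print(multi_scale_propagation(scales, y).argmax(axis=1))
```
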
Abstract:
As it becomes increasingly easy to obtain multi-modal or multi-view data, multi-view clustering has gained much attention recently. However, many methods learn the affinity matrix from the original data and may lead to unsatisfactory results because of the noise in the raw dataset. Besides, some methods neglect the diverse roles played by different views and treat them equally. In this paper, we propose a novel Markov chain algorithm named consensus guided auto-weighted multi-view clustering (CAMC) to tackle these problems. A transition probability matrix is constructed for each view to learn the affinity matrix indirectly, which reduces the effects of redundancy and noise in the original data. The consensus transition probability matrix is obtained in an auto-weighted way, in which the optimal weight for each view is obtained automatically. Besides, a constrained Laplacian rank is imposed on the consensus transition probability matrix to ensure that the number of connected components in the Laplacian graph is exactly equal to the number of clusters. Moreover, an optimization strategy based on the alternating direction method of multipliers (ADMM) is proposed to solve the problem. The effectiveness of the proposed algorithm is verified on a toy dataset. Extensive experiments on seven real-world datasets of different types show that CAMC outperforms the other eight benchmark algorithms in terms of clustering performance.
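
A minimal NumPy sketch of two ingredients mentioned above: building a row-stochastic transition probability matrix per view and forming an auto-weighted consensus. The Gaussian kernel and inverse-distance weight update are common heuristics assumed for illustration; the constrained-Laplacian-rank step and ADMM solver are omitted.

```python
import numpy as np

def transition_matrix(X, sigma=1.0):
    """Row-stochastic transition probability matrix built from one view's features."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    return A / A.sum(axis=1, keepdims=True)

def auto_weighted_consensus(P_views, n_iter=10):
    """Alternate between the consensus matrix and per-view weights
    (assumed inverse-distance auto-weighting heuristic)."""
    w = np.ones(len(P_views)) / len(P_views)
    for _ in range(n_iter):
        P = sum(wv * Pv for wv, Pv in zip(w, P_views))            # weighted consensus
        dist = np.array([np.linalg.norm(P - Pv) for Pv in P_views])
        w = 1.0 / (2 * dist + 1e-12)
        w /= w.sum()
    return P, w

views = [np.random.rand(8, 5), np.random.rand(8, 3)]              # two views, 8 samples
P, w = auto_weighted_consensus([transition_matrix(X) for X in views])
print(w, P.sum(axis=1))                                           # weights; rows sum to 1
```
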
Abstract:
Recommendation technology based on heterogeneous network embedding can effectively capture the structural information in the network and thus improve recommendation performance. However, existing recommendation techniques based on heterogeneous network embedding not only ignore the attribute information of nodes and the various types of edge relations between nodes, but also ignore the diverse influences of different nodes' attribute information on recommendation results. To address these issues, a product recommendation framework based on attributed heterogeneous information network embedding with a self-attention mechanism (AHNER) is proposed. The framework utilizes attributed heterogeneous information network embedding to learn unified low-dimensional embedding representations of users and products. When learning node embedding representations, considering that different attribute information has different effects on recommendation results and that different edge relations between nodes reflect users' different preferences for products, the self-attention mechanism is exploited to mine the latent information of node attributes and different edge types, and attribute embedding representations are learned. Meanwhile, to overcome the limitation of the traditional dot product as a matching function, the framework also exploits a deep neural network to learn a more effective matching function for the recommendation problem. We conduct extensive experiments on three public datasets to evaluate the performance of AHNER. The experimental results reveal that AHNER is feasible and effective.
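
To show the kind of attribute-level self-attention referred to here, this NumPy sketch applies standard scaled dot-product self-attention over one node's attribute embeddings; the mean pooling at the end and all parameter shapes are assumptions, not AHNER's specification.

```python
import numpy as np

def self_attention(E, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a node's attribute embeddings.
    E: (num_attributes, d) attribute embedding matrix of one node."""
    Q, K, V = E @ Wq, E @ Wk, E @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)           # softmax over attributes
    return (attn @ V).mean(axis=0)                    # fused attribute representation

d = 8
E = np.random.randn(5, d)                             # 5 attributes of one node
Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))
print(self_attention(E, Wq, Wk, Wv).shape)            # (8,)
```
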
Abstract:
Since pedestrian images taken by monitoring equipment in natural scenes are often occluded by various obstacles, occlusion is a great challenge for person re-identification. To address this problem, a spatial attention and pose estimation (SAPE) model is proposed. In order to take both global and local features into consideration, a multi-task network is constructed to realize multi-granularity representation of features. By means of the spatial attention mechanism, the region of interest is directed to the spatial semantic information in the image, and visual knowledge helpful for re-identification is mined from the global structural pattern. Then, combined with the idea of part matching, the feature map extracted from the residual network is evenly divided into several horizontal parts, and the identification granularity is increased by matching the local features. On this basis, the key information of pedestrians extracted by the improved pose estimator is fused with the feature map extracted by the convolutional neural network, and a threshold is set to remove the occluded area, so that features with strong discriminative power are obtained and the influence of occlusion on the re-identification results is eliminated. We verify the effectiveness of the SAPE model on three datasets: Occluded-DukeMTMC, Occluded-REID and Partial-REID. The experimental results show that SAPE achieves good results.
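
A minimal NumPy sketch of the horizontal part-matching step described above: the backbone feature map is split into evenly spaced horizontal stripes and each stripe is pooled into a part-level descriptor. The stripe count and the use of average pooling are assumptions for illustration.

```python
import numpy as np

def horizontal_parts(feature_map, n_parts=6):
    """Split a CNN feature map (C, H, W) into n_parts horizontal stripes and
    average-pool each stripe into a part-level descriptor."""
    C, H, W = feature_map.shape
    bounds = np.linspace(0, H, n_parts + 1, dtype=int)
    return np.stack([feature_map[:, bounds[i]:bounds[i + 1], :].mean(axis=(1, 2))
                     for i in range(n_parts)])        # (n_parts, C)

fmap = np.random.randn(2048, 24, 8)                   # e.g. a ResNet-50 last-stage map
print(horizontal_parts(fmap).shape)                   # (6, 2048)
```
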
Abstract:
Ride-sharing can effectively improve the utilization of transportation resources, decrease travel costs, alleviate traffic congestion, and reduce environmental pollution. Aiming at the dynamic ride-sharing problem, an integer linear programming model is constructed, and a bimodal cooperative matching algorithm combining offline and online matching is proposed. In the offline stage, the shared route percentage and the detour length are adopted to evaluate the matching value, and a general shared route percentage algorithm based on the weighted path search tree is designed to perform accurate pre-matching of the participants. In the online stage, a real-time order insertion algorithm based on the complex location-to-destination relationship is proposed, and the routes obtained in the offline matching stage are further improved. Through the bimodal cooperation, the real-time performance and solution quality of the proposed algorithm are significantly improved. Finally, a large number of experiments based on real-world data are performed. The results show that the overall sharing value and efficiency of the proposed algorithm surpass those of the comparative algorithms. The average offline matching rate and the average bimodal cooperative matching rate reach 93.71% and 85.53%, respectively, while transportation efficiency is improved by 82.86% and vehicle concurrency is reduced by 84.86%.
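
As a rough illustration of the online order-insertion idea, the Python sketch below tries every (pickup, dropoff) position pair in a vehicle's current route and keeps the feasible insertion with the smallest added distance. The Euclidean distances and the detour constraint are illustrative assumptions, not the paper's algorithm.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def route_length(route):
    return sum(dist(a, b) for a, b in zip(route, route[1:]))

def best_insertion(route, pickup, dropoff, max_detour):
    """Return (added_distance, new_route), or None if no feasible insertion exists."""
    base, best = route_length(route), None
    for i in range(1, len(route) + 1):               # pickup position
        for j in range(i, len(route) + 1):           # dropoff position (after pickup)
            cand = route[:i] + [pickup] + route[i:j] + [dropoff] + route[j:]
            extra = route_length(cand) - base
            if extra <= max_detour and (best is None or extra < best[0]):
                best = (extra, cand)
    return best

route = [(0, 0), (4, 0), (8, 0)]                     # current vehicle path
print(best_insertion(route, (2, 1), (6, 1), max_detour=3.0))
```
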
Abstract:
With the continuous growth of Web information, predicting users' sparse behavior has become a research hotspot in recommender systems. The factorization machine (FM) was proposed to alleviate, to a certain extent, the problem of inaccurate prediction on sparse datasets. The main idea of FM is to capture rich semantic relations with second-order feature interactions. Subsequently, inspired by the feature interactions of FM, the interaction-aware factorization machine (IFM) introduced the concept of field interaction to obtain more accurate predictions; its primary motivation is to combine feature interactions with field interactions to expand the space of potential interactions. Based on IFM, we propose a feature-over-field interaction factorization machine (FIFM), which is built on feature interactions and field interactions, and design a feature-over-field interaction mechanism (FIM) to exploit the effective predictive signals hidden in the interaction context. A fused interaction-aware method is then adopted to predict users' behaviors in different sparse scenarios. Besides, we propose a deep-learning-based neural network version, the generalized feature-field interaction model (GFIM), to further extract more nonlinear higher-order interaction signals; it requires more parameters and has higher time complexity, and is thus suited to scenarios with sufficient computing resources. Extensive experiments on four real-world datasets show that the proposed FIFM and GFIM outperform the state-of-the-art method IFM in terms of RMSE. Moreover, we conduct comprehensive experiments on various sparse datasets, and the time and space complexity are also analyzed.
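
For context, this NumPy sketch computes the standard second-order FM prediction that FIFM and GFIM extend with field interactions; it shows only the vanilla FM baseline, using the usual O(kn) reformulation of the pairwise term.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine score:
    w0 + <w, x> + sum_{i<j} <V_i, V_j> x_i x_j, via the O(kn) identity."""
    linear = w0 + x @ w
    inter = 0.5 * np.sum((x @ V) ** 2 - (x ** 2) @ (V ** 2))
    return linear + inter

n, k = 10, 4                                            # features, latent dimension
x = np.random.binomial(1, 0.3, size=n).astype(float)    # sparse one-hot-style input
w0, w, V = 0.1, np.random.randn(n), np.random.randn(n, k)
print(fm_predict(x, w0, w, V))
```
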
Abstract:
Essential proteins are not only of great importance for studying the regulation of cell growth, but also lay a theoretical foundation for the further study of diseases. At present, most essential protein identification methods are static- or dynamic-network methods based on gene expression information and the protein-protein interaction (PPI) network, but these methods do not consider the periodicity of gene expression regulation and cannot accurately describe protein networks that are periodically regulated by genes. Therefore, the concept of periodic gene expression is introduced on the basis of dynamic gene expression, and a dynamic network segmentation method is proposed. In this method, the noise in the gene expression data is filtered by constructing a gene "active" expression matrix, and the expression at each moment is classified into "active" and "inactive" states. The periods are divided according to the gene "active" expression matrix to characterize the dynamic changes of gene expression over continuous time periods. The segmented "active" expression matrix is then applied to the protein-protein interaction network to generate periodic protein subnetworks. Finally, the importance of the protein nodes in the network is measured by integrating the periodic protein subnetworks. The experimental results show that the method can effectively improve the prediction rate of essential proteins on yeast, E. coli and human bladder data.
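
A minimal NumPy sketch of building a gene "active" expression matrix, using a per-gene mean-plus-k-standard-deviations threshold; this threshold rule is a commonly used assumption for illustration and may differ from the paper's exact criterion.

```python
import numpy as np

def active_expression_matrix(expr, k=1.0):
    """Mark gene g as 'active' at time t when its expression exceeds
    mean_g + k * std_g (assumed thresholding rule). expr: genes x time."""
    mu = expr.mean(axis=1, keepdims=True)
    sigma = expr.std(axis=1, keepdims=True)
    return (expr >= mu + k * sigma).astype(int)        # 0/1 matrix, genes x time

expr = np.abs(np.random.randn(5, 12))                  # 5 genes over 12 time points
print(active_expression_matrix(expr))
```
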
Abstract:
Network anomaly detection is essential for network management and network security. Over the years, a large number of domestic and foreign studies have proposed a series of network anomaly detection methods, most of which focus on the analysis, detection and warning of data packets and independent time-series data streams. Such methods only use the temporal correlation of network data, which makes it difficult to detect new types of network anomalies or to locate and eliminate abnormal data. To solve these problems, some works integrate multiple time-series data streams and study network anomaly detection methods based on low-rank decomposition. These methods make full use of the spatio-temporal correlation of network data; they can locate abnormal data without supervision and eliminate it at the same time, so as to restore the normal data of the network. We first analyze the anomaly detection methods based on low-rank decomposition, divide them into four categories according to their different constraints on normal and abnormal data, and introduce the basic ideas, advantages and disadvantages of each category. Then, the challenges of existing anomaly detection methods based on low-rank decomposition are analyzed. Finally, possible future development trends are predicted.
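
To make the low-rank-plus-sparse idea concrete, here is a simple alternating heuristic in NumPy that splits a traffic matrix into a low-rank "normal" part and a sparse "anomaly" part; the truncated-SVD/soft-threshold alternation is an illustrative sketch, not any specific surveyed algorithm.

```python
import numpy as np

def low_rank_sparse_split(X, rank=2, sparse_thresh=0.5, n_iter=50):
    """Heuristic decomposition X ~ L + S: L is a rank-r approximation of X - S
    (truncated SVD), S keeps only large residuals (soft thresholding)."""
    S = np.zeros_like(X)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]       # low-rank "normal" traffic
        R = X - L
        S = np.sign(R) * np.maximum(np.abs(R) - sparse_thresh, 0.0)  # sparse anomalies
    return L, S

normal = np.outer(np.random.rand(20), np.random.rand(15))   # rank-1 traffic matrix
X = normal.copy()
X[3, 7] += 5.0                                               # inject one anomaly
L, S = low_rank_sparse_split(X, rank=1)
print(np.unravel_index(np.abs(S).argmax(), S.shape))         # locates the anomaly (3, 7)
```
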
Abstract:
Aiming at the problem that existing local encoding and perturbation mechanisms cannot preserve the distance between neighboring locations when collecting spatial data, we propose two efficient algorithms based on locality-sensitive hashing (LSH), called PELSH and PULSH, to answer kNN queries. The two algorithms employ multiple hash tables with multiple hash functions to index the locations of all users, and these tables are relied on to answer kNN queries. Based on the hash tables copied from the collector, each user first transforms his/her location into a 0/1 string with a Hamming embedding algorithm and then uses LSH to compress the Hamming code. Finally, the user locally runs GRR and a bit perturbation mechanism on the compressed 0/1 string and reports the perturbed value to the collector. The collector aggregates the reports from all users to reconstruct the hash tables, which are then traversed to obtain approximate kNN query results. Furthermore, in PELSH and PULSH, we use privacy budget partition and user partition strategies to design four local algorithms, called PELSHB, PELSHG, PULSHB, and PULSHG, to perturb user data. PELSH and PULSH are compared with existing algorithms on large-scale real datasets. The experimental results show that PELSH and PULSH outperform their competitors and achieve accurate results for spatial kNN queries.
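
The Python sketch below shows standard per-bit randomized response on a compressed 0/1 LSH code, as a stand-in for the bit perturbation step described above; the privacy budget split across bits and the paper's specific GRR variant are not reproduced.

```python
import math
import random

def perturb_bits(bits, epsilon):
    """Randomized response on each bit: keep the bit with probability
    e^eps / (e^eps + 1), otherwise flip it (a standard LDP primitive)."""
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return [b if random.random() < p_keep else 1 - b for b in bits]

code = [1, 0, 1, 1, 0, 0, 1, 0]          # compressed 0/1 LSH code of one user
print(perturb_bits(code, epsilon=2.0))   # perturbed report sent to the collector
```
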
Abstract:
Emerging big data technologies enable many organizations to collect massive amounts of information about individuals. Sharing such a wealth of information presents enormous opportunities for data mining applications, but data privacy has been a major barrier. Clustering-based k-anonymity is one of the most important techniques to prevent privacy disclosure in data sharing, and it can resist background-knowledge attacks and link attacks. Existing anonymity methods balance privacy and utility requirements by seeking the optimal k-equivalence set. However, viewing the results as a whole, a k-equivalence set is not necessarily the optimal solution satisfying k-anonymity, so utility optimality is not guaranteed. In this paper, we endeavor to solve this problem with an optimal clustering approach. Following this idea, we propose a greedy clustering-anonymity method that combines the greedy algorithm with a dichotomy clustering algorithm. In addition, we formulate the optimal data release problem that minimizes information loss under a privacy constraint. We also establish the functional relationship between data distance and information loss to capture the privacy/accuracy trade-off in an online way. Finally, we evaluate the mechanism through theoretical analysis and experimental verification. Evaluations on real datasets show that the proposed method can minimize information loss and is efficient in terms of running time.
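
As a simple illustration of greedy clustering toward k-anonymity, the NumPy sketch below grows clusters of at least k records by repeatedly adding the nearest unassigned record; this is a generic heuristic for intuition only, not the paper's greedy/dichotomy clustering combination.

```python
import numpy as np

def greedy_k_anonymous_clusters(X, k):
    """Greedily grow clusters of size >= k: start from an unassigned record and
    repeatedly add the unassigned record closest to the cluster center."""
    unassigned = set(range(len(X)))
    clusters = []
    while len(unassigned) >= k:
        seed = unassigned.pop()
        cluster = [seed]
        while len(cluster) < k:
            center = X[cluster].mean(axis=0)
            nearest = min(unassigned, key=lambda i: np.linalg.norm(X[i] - center))
            unassigned.remove(nearest)
            cluster.append(nearest)
        clusters.append(cluster)
    if clusters and unassigned:                  # leftover records join the last cluster
        clusters[-1].extend(unassigned)
    return clusters

X = np.random.rand(11, 3)                        # 11 records, 3 quasi-identifiers
print([len(c) for c in greedy_k_anonymous_clusters(X, k=3)])   # every group has >= 3
```
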