ISSN 1000-1239 CN 11-1777/TP


    Journal of Computer Research and Development    2019, 56 (8): 1604-1604.  
    Program Comprehension Based on Deep Learning
    Liu Fang, Li Ge, Hu Xing, Jin Zhi
    Journal of Computer Research and Development    2019, 56 (8): 1605-1620.   DOI: 10.7544/issn1000-1239.2019.20190185
    Program comprehension is the process of obtaining relevant information from programs by analyzing, abstracting, and reasoning about them. It plays an important role in software development, maintenance, migration, and other processes, and has received extensive attention in academia and industry. Traditional program comprehension relies heavily on the experience of developers. However, as the scale and complexity of software continue to grow, relying solely on developers' prior knowledge to extract program features is time-consuming and laborious, and it is difficult to fully exploit the hidden features in programs. Deep learning is a data-driven, end-to-end method that builds deep neural networks from existing data to mine hidden features, and it has been applied successfully in many fields. Applying deep learning to program comprehension makes it possible to learn the features implied in programs automatically, which exploits the knowledge they contain more fully and improves the efficiency of program comprehension. This paper surveys recent research on program comprehension based on deep learning. We first analyze the properties of programs and then introduce the mainstream program comprehension models, including sequential models, structural models, and execution-trace-based models. We then introduce applications of deep learning-based program comprehension in program analysis, focusing mainly on code completion, code summarization, and code search. Finally, we summarize the challenges in program comprehension research.
    Predicting the Dynamics in Internet Finance Based on Deep Neural Network Structure
    Zhao Hongke, Wu Likang, Li Zhi, Zhang Xi, Liu Qi, Chen Enhong
    Journal of Computer Research and Development    2019, 56 (8): 1621-1631.   DOI: 10.7544/issn1000-1239.2019.20190330
    In recent years, the Internet financial market has developed rapidly across the globe, and Internet finance has become a hot topic in academia. Compared with traditional financial markets, the Internet financial market has higher liquidity and volatility. In this paper, the dynamics (daily trading amount and count) of the Internet financial market are studied, and a prediction model is proposed based on a deep neural network that fuses hierarchical time series learning. First, the model takes multiple sequences (the macro dynamic sequence and multiple subsequences) as input features. Then, an attention mechanism fuses the input variables along both the time and the subsequence feature dimensions. Next, an optimization function based on a stability constraint on the sequence prediction is designed, which makes the model more robust. Finally, extensive experiments on large-scale real-world datasets demonstrate the effectiveness and robustness of the proposed model in predicting the dynamics of the Internet finance market.
    Person Re-Identification Based on Deep Convolutional Generative Adversarial Network and Expanded Neighbor Reranking
    Dai Chenchao, Wang Hongyuan, Ni Tongguang, Chen Shoubing
    Journal of Computer Research and Development    2019, 56 (8): 1632-1641.   DOI: 10.7544/issn1000-1239.2019.20190195
    Person re-identification (Re-ID) focuses on identifying the same person across disjoint camera views. The task is highly challenging, especially when the database contains only a few images per person. To address the shortage of person images in Re-ID datasets, a method that generates extra training data from the original dataset is proposed. This work faces two challenges: how to obtain more training data from the original training set, and how to handle the newly generated data. A deep convolutional generative adversarial network is used to generate extra unlabeled person images, and label smoothing regularization is used to process these generated images. To further improve Re-ID accuracy, a new unsupervised reranking framework is proposed. The framework requires neither recomputing a new ranking list for each image pair nor any human interaction or label information. Experiments on the Market-1501, CUHK03, and DukeMTMC-reID datasets verify the effectiveness of the proposed method.
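One plausible reading of the label-handling step is that a generated image, which has no identity label, is trained against a uniform distribution over the training identities. The sketch below illustrates that idea only; it is an assumption about how label smoothing regularization could be applied, not the authors' implementation, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def target_distribution(num_classes, label=None):
    """One-hot target for a labeled real image; uniform target for an
    unlabeled generated image (label is None). ASSUMPTION: the uniform
    target is our reading of label smoothing for generated data."""
    if label is None:
        return [1.0 / num_classes] * num_classes
    return [1.0 if k == label else 0.0 for k in range(num_classes)]

def cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

# A real image of identity 2 and a generated image share the same loss function.
loss_real = cross_entropy([0.1, 0.2, 3.0, 0.0], target_distribution(4, label=2))
loss_generated = cross_entropy([0.1, 0.2, 3.0, 0.0], target_distribution(4))
```

With a uniform target, a confident prediction on a generated image is penalized, which discourages the network from assigning such images to any single identity.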
    Non-Stationary Multivariate Time Series Prediction with MIX Gated Unit
    Liu Jiexi, Chen Songcan
    Journal of Computer Research and Development    2019, 56 (8): 1642-1651.   DOI: 10.7544/issn1000-1239.2019.20190326
    Non-stationary multivariate time series (NSMTS) forecasting remains a challenging problem. Existing deep learning models based on recurrent neural networks (RNNs), especially long short-term memory (LSTM) and gated recurrent unit (GRU) networks, have achieved impressive prediction performance. Although the LSTM architecture is relatively complex, it does not always dominate in performance. Recent research shows that the minimal gated unit (MGU), which has a simpler gate structure, can both simplify the network architecture and improve training efficiency in computer vision and some sequence problems. Most importantly, our experiments show that this kind of unit can be applied effectively to NSMTS prediction and achieves results comparable with LSTM and MGU neural networks. However, none of the three gated-unit-based neural networks dominates in performance across all NSMTS. Therefore, in this paper we propose a novel linear MIX gated unit (MIXGU). This unit dynamically adjusts the importance weights of GRU and MGU during training to obtain a better hybrid structure for each MIXGU in the network. Experimental results show that the MIXGU neural network achieves higher prediction performance than other state-of-the-art neural network models built on a single gated unit.
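To make the idea of a linear mix concrete, the sketch below combines one GRU step and one MGU step with a learnable importance weight. It uses scalar states for readability and assumes a convex combination controlled by a sigmoid of a single parameter (`mix_logit`); the abstract does not give the exact formulation, so the mixing rule and all names here are illustrative.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h, p):
    """Scalar GRU step; p maps parameter names to floats."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])   # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])   # reset gate
    cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    return (1.0 - z) * h + z * cand

def mgu_step(x, h, p):
    """Scalar MGU step: a single forget gate replaces both GRU gates."""
    f = sigmoid(p["wf"] * x + p["uf"] * h + p["bf"])
    cand = math.tanh(p["wh"] * x + p["uh"] * (f * h) + p["bh"])
    return (1.0 - f) * h + f * cand

def mix_step(x, h, gru_p, mgu_p, mix_logit):
    """ASSUMPTION: linear mix of the two candidate states, with the
    importance weight alpha learned alongside the other parameters."""
    alpha = sigmoid(mix_logit)
    return alpha * gru_step(x, h, gru_p) + (1.0 - alpha) * mgu_step(x, h, mgu_p)

gru_params = {"wz": 0.5, "uz": 0.1, "bz": 0.0, "wr": 0.3, "ur": 0.2,
              "br": 0.0, "wh": 1.0, "uh": 0.5, "bh": 0.0}
mgu_params = {"wf": 0.4, "uf": 0.1, "bf": 0.0, "wh": 0.8, "uh": 0.3, "bh": 0.0}
h_next = mix_step(1.0, 0.2, gru_params, mgu_params, mix_logit=0.0)
```

A large positive `mix_logit` makes the unit behave like a GRU, and a large negative one like an MGU, so training can settle on a different blend for each unit.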
    Prediction of miRNA-lncRNA Interaction by Combining CNN and Bi-LSTM
    Shi Wenhao, Meng Jun, Zhang Peng, Liu Chanjuan
    Journal of Computer Research and Development    2019, 56 (8): 1652-1660.   DOI: 10.7544/issn1000-1239.2019.20190128
    Non-coding RNA (ncRNA) plays an important regulatory role in many life activities of animals and plants, and the interaction between microRNA (miRNA) and long non-coding RNA (lncRNA) is particularly important. Studying this interaction not only helps to analyze the biological functions of genes, but also provides new ideas for disease diagnosis and treatment and for plant genetic breeding. At present, miRNA-lncRNA interaction is mostly predicted with biological experiments and machine learning methods. Because biological identification is costly and time-consuming, and machine learning methods require excessive manual intervention and a complex feature extraction process, a deep learning model combining a convolutional neural network (CNN) and a bidirectional long short-term memory network (Bi-LSTM) is proposed. It combines the advantages of the two models: it considers the information correlation between sequences, incorporates context information, and fully extracts features from the sequence data. In the experiments, the model is evaluated by cross-validation and compared with traditional machine learning methods and single models on the Zea mays dataset, where it achieves superior classification results. In addition, the model is tested on the Solanum tuberosum and Triticum aestivum species, reaching accuracies of up to 95% and 93%, respectively, which verifies its good generalization ability.
    An Integrated Recommendation Model Based on Two-stage Deep Learning
    Wang Ruiqin, Wu Zongda, Jiang Yunliang, Lou Jungang
    Journal of Computer Research and Development    2019, 56 (8): 1661-1669.   DOI: 10.7544/issn1000-1239.2019.20190178
    In recent years, deep learning has been widely used in recommendation systems and has achieved great success. However, the quality of the input to a deep learning model strongly influences the learning results. A sparse input feature vector not only makes subsequent model training more difficult, but also causes the learning results to fall into local optima. In this article, an integrated recommendation model based on two-stage deep learning is proposed. First, two separate marginalized stacked denoising auto-encoder (mSDA) models with closed-form parameter calculation are used to extract high-level abstract features of the users and the items. The resulting user and item abstract features are then concatenated as the input vector of a deep neural network (DNN) model, and parameter learning and model optimization are performed through joint training. In addition, to model low-order feature interactions, a logistic regression model based on the original feature vector is integrated into the recommendation model. Extensive experiments on two real-world datasets indicate that the proposed model achieves excellent recommendation performance compared with state-of-the-art methods, especially in data-sparse and cold-start settings.
    Deep Forest for Multiple Instance Learning
    Ren Jie, Hou Bojian, Jiang Yuan
    Journal of Computer Research and Development    2019, 56 (8): 1670-1676.   DOI: 10.7544/issn1000-1239.2019.20190332
    Multi-instance learning has been applied to various tasks, such as image retrieval, text classification, and face recognition. Deep neural networks have also been applied successfully to many tasks and problems, and MI-Nets are one of their successful applications to multi-instance learning. MI-Nets perform well mainly on image tasks, but show mediocre performance on non-image tasks. Over the last two years, deep forest has achieved good performance on non-image tasks and is favored for having fewer parameters and steadier performance than deep neural networks. It is therefore urgent and necessary to apply deep forest to multi-instance learning. However, because of the structural limitations of deep forest, we cannot simply substitute a bag-level forest for each forest in deep forest; its structure must be changed. In this paper, we propose a new deep forest structure, multiple instance deep forest (MIDF). We regard each instance in a bag as a new bag, so that the distribution output by the middle level can be concatenated with the original bag, which makes the cascade structure valid. We can also determine the number of layers of MIDF. Experimental results show that our method performs comparably to MI-Nets on image tasks, while on non-image tasks it outperforms MI-Nets and other baseline methods.
    Sample-Weighted Multi-View Clustering
    Hong Min, Jia Caiyan, Li Yafang, Yu Jian
    Journal of Computer Research and Development    2019, 56 (8): 1677-1685.   DOI: 10.7544/issn1000-1239.2019.20190150
    In the era of big data, the ability to collect, store, transmit, and manage data has improved steadily. Various industries have accumulated large amounts of data, which are often multi-source and heterogeneous. How to cluster such multi-source data effectively (known as multi-view clustering) has become one of the focuses of current machine learning research. Existing multi-view clustering algorithms mainly consider the contribution of different views and features to the cluster structure from a “global” perspective, without considering the “local” differences in complementary information between samples. This paper therefore proposes a new sample-weighted multi-view clustering method (SWMVC). The method weights each sample under each view and adopts the alternating direction method of multipliers (ADMM) to learn the sample weights. It can learn the “local” differences in view weights at different sample points while also reflecting the “global” differences in the contribution of different views to the cluster structure, which makes it more flexible. Experiments on multiple datasets show that SWMVC achieves better clustering results on heterogeneous view data.
    Optimal Individual Convergence Rate of the Heavy-Ball-Based Momentum Methods
    Cheng Yujia, Tao Wei, Liu Yuxiang, Tao Qing
    Journal of Computer Research and Development    2019, 56 (8): 1686-1694.   DOI: 10.7544/issn1000-1239.2019.20190167
    The momentum method is widely used as an acceleration technique to improve the convergence of first-order gradient algorithms. So far, most of the literature on momentum methods is limited to the accelerated method proposed by Nesterov, while the Heavy-ball momentum method proposed by Polyak is seldom studied. In particular, for non-smooth objective functions, the optimal individual convergence of Nesterov accelerated methods has been derived, and these methods perform well in solving sparse optimization problems. In contrast, although the Heavy-ball momentum method has been proved to have an optimal convergence rate, the result holds only for the averaged outputs. To the best of our knowledge, whether it has optimal individual convergence remains unknown. In this paper, we focus on non-smooth optimization. We prove that the Heavy-ball momentum method achieves optimal individual convergence when the time-varying step size is chosen carefully, which indicates that Heavy-ball momentum is an efficient acceleration strategy for the individual convergence of projected subgradient methods. As an application, constrained hinge loss optimization problems within an l1-norm ball are considered. Experiments comparing the method with other optimization algorithms confirm the correctness of our theoretical analysis and the ability of the proposed algorithms to preserve sparsity.
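The setting described above can be sketched as a projected subgradient step with a Heavy-ball momentum term, applied to the hinge loss inside an l1-norm ball, returning the last iterate rather than an average. The step-size and momentum schedules below are illustrative placeholders, not the time-varying schedule derived in the paper, and all function names are hypothetical.

```python
import math

def project_l1(v, radius):
    """Euclidean projection of v onto the l1 ball of the given radius
    (sort-based method)."""
    if sum(abs(a) for a in v) <= radius:
        return list(v)
    u = sorted((abs(a) for a in v), reverse=True)
    cumsum, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        t = (cumsum - radius) / j
        if uj - t > 0:          # holds for a prefix of j; keep the last one
            theta = t
    return [math.copysign(max(abs(a) - theta, 0.0), a) for a in v]

def hinge_subgradient(w, data):
    """A subgradient of the average hinge loss; data is a list of (x, y)
    pairs with y in {-1, +1}."""
    g = [0.0] * len(w)
    for x, y in data:
        if y * sum(wi * xi for wi, xi in zip(w, x)) < 1.0:
            for i, xi in enumerate(x):
                g[i] -= y * xi / len(data)
    return g

def heavy_ball(data, dim, radius, steps):
    """Projected subgradient method with Heavy-ball momentum."""
    w_prev = [0.0] * dim
    w = [0.0] * dim
    for t in range(1, steps + 1):
        alpha = 1.0 / math.sqrt(t)   # ASSUMPTION: placeholder step size
        beta = t / (t + 2.0)         # ASSUMPTION: placeholder momentum weight
        g = hinge_subgradient(w, data)
        moved = [wi - alpha * gi + beta * (wi - pi)
                 for wi, gi, pi in zip(w, g, w_prev)]
        w_prev, w = w, project_l1(moved, radius)
    return w                          # last (individual) iterate, not an average

toy_data = [([1.0, 0.0], 1), ([-1.0, 0.0], -1)]
w_last = heavy_ball(toy_data, dim=2, radius=1.0, steps=50)
```

On this toy problem only the first feature is informative, and the l1 projection keeps the second coordinate at zero, which is the sparsity-preserving behavior the abstract refers to.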
    An Adaptive Regression Feature Selection Method for Datasets with Outliers
    Guo Yaqing, Wang Wenjian, Su Meihong
    Journal of Computer Research and Development    2019, 56 (8): 1695-1707.   DOI: 10.7544/issn1000-1239.2019.20190313
    Irrelevant and redundant features in data make learning tasks more difficult; feature selection can address this problem effectively and improve both learning efficiency and learner performance. Most existing feature selection approaches are designed for classification problems, and few studies address regression. In particular, existing methods do not perform well in the presence of outliers. Although some methods increase robustness by weighting the sample loss functions, the weights are set in advance and fixed throughout feature selection and learner training, which leads to poor adaptability. This paper proposes a regression feature selection method for data with outliers, named adaptive weight LASSO (AWLASSO). First, it updates the sample errors according to the regression coefficients. Then the weights of the loss functions of all samples are set according to an adaptive regularization term: samples whose errors exceed the current threshold receive smaller weights, and samples whose errors are below the threshold receive larger weights. The regression coefficients are estimated iteratively under the weighted loss function with the updated weights. AWLASSO uses the threshold to control whether a sample participates in regression coefficient estimation. Only samples with small errors participate, so a better estimate may be obtained in the end. In addition, the error threshold of AWLASSO is not fixed but increases over the iterations (the initial threshold is usually small so that the initial regression coefficient estimate is accurate). Samples misjudged as outliers therefore have a chance to be added back to the training set. AWLASSO regards samples whose errors exceed the maximum threshold as outliers because their learning cost is higher, and sets the weights of their loss functions to 0, which reduces the influence of outliers. Experimental results on artificial data and benchmark datasets demonstrate that, compared with classical methods, the proposed AWLASSO has better robustness and sparsity, especially for datasets with outliers.
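The alternation described above (update errors, reweight samples against a growing threshold, re-estimate coefficients) can be sketched as follows. This is a simplified illustration under stated assumptions: weights are binary (1 below the threshold, 0 above) rather than derived from the paper's adaptive regularization term, the threshold grows geometrically, the first estimate uses all samples, and the weighted LASSO is solved by plain coordinate descent. All names and parameters are hypothetical.

```python
def soft_threshold(v, lam):
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

def weighted_lasso(X, y, weights, lam, sweeps=50):
    """Coordinate descent for 0.5 * sum_i w_i (y_i - x_i . b)^2 + lam * |b|_1."""
    d = len(X[0])
    beta = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            rho, z = 0.0, 0.0
            for xi, yi, wi in zip(X, y, weights):
                partial = yi - sum(xi[k] * beta[k] for k in range(d) if k != j)
                rho += wi * xi[j] * partial
                z += wi * xi[j] ** 2
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta

def awlasso_sketch(X, y, lam, thr, thr_max, growth=2.0):
    """ASSUMPTION: binary weights and a geometric threshold schedule."""
    weights = [1.0] * len(y)
    beta = weighted_lasso(X, y, weights, lam)      # initial estimate
    while True:
        errors = [(yi - sum(a * b for a, b in zip(xi, beta))) ** 2
                  for xi, yi in zip(X, y)]
        weights = [1.0 if e <= thr else 0.0 for e in errors]
        beta = weighted_lasso(X, y, weights, lam)
        if thr >= thr_max:
            break                                  # errors above thr_max: outliers
        thr = min(thr * growth, thr_max)
    return beta, weights

# y = 2x except for the last sample, which is an outlier.
X = [[1.0], [2.0], [3.0], [4.0], [5.0]]
y = [2.0, 4.0, 6.0, 8.0, 100.0]
beta, weights = awlasso_sketch(X, y, lam=0.01, thr=100.0, thr_max=1000.0)
```

In this example the first round keeps only the sample least affected by the outlier; as the threshold grows, the other clean samples rejoin the training set while the outlier stays excluded.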
    An Experience-Guided Deep Deterministic Actor-Critic Algorithm with Multi-Actor
    Chen Hongming, Liu Quan, Yan Yan, He Bin, Jiang Yubin, Zhang Linlin
    Journal of Computer Research and Development    2019, 56 (8): 1708-1720.   DOI: 10.7544/issn1000-1239.2019.20190155
    Continuous control has always been an important research direction in reinforcement learning. In recent years, the development of deep learning (DL) and the advent of the deterministic policy gradient (DPG) algorithm have provided many good ideas for solving continuous control problems. The main difficulty these methods face is exploration in the continuous action space, and some of them explore by injecting external noise into the action space. However, this exploration method does not perform well in some continuous control tasks. This paper proposes an experience-guided deep deterministic actor-critic algorithm with multi-actor (EGDDAC-MA) that uses no external noise and instead learns a guiding network from excellent experiences to guide the updates of the actor network and the critic network. In addition, it uses a multi-actor actor-critic (AC) model that assigns a different actor to each phase of an episode. These actors are independent and do not interfere with each other. Experimental results show that, compared with the DDPG, TRPO, and PPO algorithms, the proposed algorithm performs better in most continuous control tasks on the Gym simulation platform.
    Scene Graph Generation Based on Shuffle Residual Context Information
    Lin Xin, Tian Xin, Ji Yi, Xu Yunlong, Liu Chunping
    Journal of Computer Research and Development    2019, 56 (8): 1721-1730.   DOI: 10.7544/issn1000-1239.2019.20190329
    Scene graphs play an important role in visual understanding. Existing scene graph generation methods focus on the subjects, the objects, and the predicates between them. However, humans abstract relationships using spatial relation context, semantic context, and the interactions between scene objects to understand and reason about a scene as a whole. To obtain a better global context representation and reduce the impact of dataset bias, we propose a new scene graph generation framework called the residual shuffle sequence model (RSSQ). Our method consists of object decoding, residual shuffle, and position embedding modules. The residual shuffle module stacks two basic structures: a random shuffle operation and a residual bidirectional LSTM. We iteratively apply the random shuffle to the hidden state of the bidirectional LSTM to reduce the impact of dataset bias, and extract the shared global context information through the residual connection structure. To strengthen the spatial relationship between pairs of objects, the position embedding module encodes the relative position and area ratio of the objects. Experimental results on three sub-tasks of different difficulty on the Visual Genome dataset demonstrate that the proposed method generates better scene graphs under the Recall@50 and Recall@100 settings, owing to better global context and spatial information.
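As a small illustration of the kind of geometry a position embedding module could consume, the sketch below computes a relative position and an area ratio for a subject and object box. The normalization by the subject's width and height is an assumption made for this example; the abstract does not specify the encoding, and the function name is hypothetical.

```python
def pair_geometry(subj, obj):
    """Return (dx, dy, area_ratio) for two boxes given as (x1, y1, x2, y2).

    dx, dy: offset of the object's center from the subject's center,
            scaled by the subject's width and height (ASSUMPTION).
    area_ratio: object area divided by subject area.
    """
    sw, sh = subj[2] - subj[0], subj[3] - subj[1]
    ow, oh = obj[2] - obj[0], obj[3] - obj[1]
    scx, scy = subj[0] + sw / 2.0, subj[1] + sh / 2.0
    ocx, ocy = obj[0] + ow / 2.0, obj[1] + oh / 2.0
    dx = (ocx - scx) / sw
    dy = (ocy - scy) / sh
    area_ratio = (ow * oh) / (sw * sh)
    return (dx, dy, area_ratio)

# An object of the same size directly to the right of the subject.
features = pair_geometry((0, 0, 2, 2), (2, 0, 4, 2))
```

Features like these are scale-invariant, so the same "to the right of, same size" pattern yields the same vector regardless of image resolution.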
    Aspect-Level Sentiment Classification for Sentences Based on Dependency Tree and Distance Attention
    Su Jindian, Ouyang Zhifan, Yu Shanshan
    Journal of Computer Research and Development    2019, 56 (8): 1731-1745.   DOI: 10.7544/issn1000-1239.2019.20190102
    Current attention-based approaches to aspect-level sentiment classification usually neglect the contexts of aspects and the distance between words and aspects, which makes it difficult for the attention mechanism to learn suitable attention weights. To address this problem, a dependency tree and distance attention-based model, DTDA, is proposed for aspect-level sentiment classification. First, DTDA uses the dependency tree of a sentence to extract the dependency subtree (aspect sub-sentence) that contains the modification information of the aspect, and then uses bidirectional GRU networks to learn the contexts of the sentence and the aspects. Next, position weights are determined according to the syntactic distance between each word and the aspect along their path in the dependency tree, and are combined with the relative distance to build sentence representations that contain both semantic and distance information. Finally, aspect-related sentiment feature representations are generated via the attention mechanism and merged with the sentence-related contexts, and the result is fed to a softmax layer for classification. Experimental results show that DTDA achieves results comparable with current state-of-the-art methods on the two SemEval 2014 benchmark datasets, Laptop and Restaurant. With word vectors pre-trained on domain-related data, DTDA achieves a precision of 77.01% on Laptop and 81.68% on Restaurant.
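The syntactic distance used for the position weights can be illustrated with a breadth-first search over the dependency tree, treated as an undirected graph. The weighting formula `1 - d / n` below is an assumption chosen for illustration, not the formula from the paper, and the example parse is hand-written.

```python
from collections import deque

def tree_distances(heads, source):
    """Path length from `source` to every token in a dependency tree.

    heads[i] is the index of token i's head, or -1 for the root.
    """
    n = len(heads)
    adj = [[] for _ in range(n)]
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child].append(head)
            adj[head].append(child)
    dist = [None] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def position_weights(heads, aspect):
    """ASSUMPTION: weight decays linearly with syntactic distance."""
    n = len(heads)
    return [1.0 - d / n for d in tree_distances(heads, aspect)]

# "The battery life is great": det(life, The), compound(life, battery),
# nsubj(great, life), cop(great, is), root = great.
tokens = ["The", "battery", "life", "is", "great"]
heads = [2, 2, 4, 4, -1]
weights = position_weights(heads, aspect=2)
```

Here "great" is one edge from the aspect "life" even though another word separates them in the sentence, which is the kind of relationship a purely linear distance would understate.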
    Negatively Correlated Search with Asymmetry for Real-Parameter Optimization Problems
    Yu Runlong, Zhao Hongke, Wang Zhong, Ye Yuyang, Zhang Peining, Liu Qi, Chen Enhong
    Journal of Computer Research and Development    2019, 56 (8): 1746-1757.   DOI: 10.7544/issn1000-1239.2019.20190198
    Because many real-world applications are closely related to complex real-parameter optimization problems, metaheuristic assumptions are employed to help design search strategies and have proved to be powerful tools. The balance between exploration (diversification) of new areas of the search space and exploitation (intensification) of good solutions achieved by this kind of algorithm is one of the key factors in its high performance relative to other metaheuristics. In particular, negatively correlated search (NCS) improves the search performance of parallel hill climbing by introducing negative correlation between the search trends of different search processes, which contributes greatly to maintaining the diversity of solutions. NCS models the search behavior of each search process as a probability distribution. On this basis, we further divide the search behaviors of a pair of search processes into global search behavior and local search behavior according to the size of the region each search process covers. We then present a new metaheuristic, negatively correlated search with asymmetry (NSA), which assumes that a search process with global search behavior should move away from a search process with local search behavior. Owing to the asymmetry of the negative correlation between search processes, NSA is considerably more efficient than NCS. Experimental results show that NSA is competitive with well-established search methods, achieving the best overall performance on 20 multimodal real-parameter optimization problems.
    Low-Redundancy Knowledge Graph Management with Range Query Support
    Wang Fei, Qian Tieyun, Liu Bin, Peng Zhiyong
    Journal of Computer Research and Development    2019, 56 (8): 1758-1771.   DOI: 10.7544/issn1000-1239.2019.20190169
    As more and more data is published in the form of knowledge graphs, knowledge graph management has attracted considerable attention. Existing approaches to knowledge graph management have two drawbacks: 1) logical storage modeling generates a great deal of redundancy and supports range queries on continuous attributes poorly; 2) semantic storage modeling is costly and adapts inefficiently to the dynamic evolution of knowledge graphs. In this paper, we propose a novel method called the cluster object deputy model (CODM) to manage knowledge and metadata. The model has two key properties: schema-based logical storage modeling and lightweight semantic storage modeling. To this end, we design a schema clustering algorithm based on set edit distance that converts a knowledge graph into schema data, which enables schema-based storage of the data and supports index specification by attribute type. In addition, CODM constructs a class hierarchy to model the different associations among entities, and adopts object pointers to achieve lightweight materialization of generalized semantic associations. Experimental results show that CODM greatly reduces data redundancy and outperforms state-of-the-art methods on range queries. The results also indicate that CODM can accelerate the processing of complex queries.
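A rough sketch of schema clustering by set edit distance follows. It assumes that each entity is described by the set of its attribute names, that the set edit distance is the number of insertions and deletions needed to turn one set into the other (the size of the symmetric difference), and that clustering is a greedy single pass against each cluster's first member. None of these details come from the abstract; they are placeholders for illustration, as are the names.

```python
def set_edit_distance(a, b):
    """Number of element insertions and deletions turning set a into set b."""
    return len(a ^ b)

def cluster_schemas(entities, max_distance):
    """Greedy single-pass clustering of entities by attribute set.

    entities: dict mapping entity name -> set of attribute names.
    Returns a list of (representative_attribute_set, member_names).
    ASSUMPTION: the first member of a cluster serves as its representative.
    """
    clusters = []
    for name, attrs in entities.items():
        for representative, members in clusters:
            if set_edit_distance(representative, attrs) <= max_distance:
                members.append(name)
                break
        else:
            clusters.append((frozenset(attrs), [name]))
    return clusters

entities = {
    "alice": {"name", "age"},
    "bob": {"name", "age", "email"},
    "paris": {"name", "population", "country"},
}
clusters = cluster_schemas(entities, max_distance=1)
members = [names for _, names in clusters]
```

Entities that land in the same cluster share nearly the same attributes, so they could be stored under one schema with few empty fields, which is the redundancy the abstract aims to reduce.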