ISSN 1000-1239 CN 11-1777/TP


    Default Latest Most Read
    Please wait a minute...
    For Selected: Toggle Thumbnails
    Journal of Computer Research and Development    2018, 55 (8): 1597-1598.  
    Abstract1487)   HTML34)    PDF (407KB)(926)       Save
    Related Articles | Metrics
    Stealing Behavior Recognition Method Based on Electricity Characteristics Analysis
    Shi Yuliang, Rong Yiping,Zhu Weiyi
    Journal of Computer Research and Development    2018, 55 (8): 1599-1608.   DOI: 10.7544/issn1000-1239.2018.20180216
    Abstract1851)   HTML27)    PDF (2681KB)(916)       Save
    Anti-stealing electricity is an indispensable component of electricity enterprise management. In view of the current problems such as the large number of users, the wide distribution area, the increasing year-on-year power stealing, and the lack of supervision personnel, this paper analyzes and handles the data of power stealing behavior, and proposes a stealing behavior recognition method which can identify the stealing users. First, based on the collected samples, this method adopts a filtering algorithm and a regular threshold to implement feature extraction, so as to improve the effectiveness of the collected data. Then, the user’s stealing behavior diagnosis model is constructed based on the logistic regression algorithm to realize the determination of suspected users. In addition, this paper uses the closed-loop working mechanism which continues to update data by the way of pushing, shooting, processing and feedback to continuously optimize the model. According to the collected data provided by the power consumption information collection system and marketing business application system of State Grid Shandong Electric Power Company, the experimental results prove the feasibility and applicability of the method.
    Related Articles | Metrics
    The Automatic Identification of Butterfly Species
    Xie Juanying, Hou Qi, Shi Yinghuan, Lü Peng, Jing Liping, Zhuang Fuzhen, Zhang Junping, Tan Xiaoyang,Xu Shengquan
    Journal of Computer Research and Development    2018, 55 (8): 1609-1618.   DOI: 10.7544/issn1000-1239.2018.20180181
    Abstract2625)   HTML37)    PDF (4187KB)(865)       Save
    The available butterfly image data sets comprise a few limited species, and the images in the data sets are always standard patterns without the images of butterflies in their living environments. To overcome the aforementioned limitations in the butterfly image data sets, we build a butterfly image data set composed of all species of butterflies in Monograph of Chinese butterflies with 4270 standard pattern images of 1176 butterfly species, and 1425 butterfly images from living environment of 111 species. We use the deep learning technique Faster R-CNN to develop an automatic butterfly identification system including butterfly position detection in images from living environment and species recognition. We delete those butterfly species with only one living environment image from data set, then partition the rest butterfly images from living environment into two subsets in half-half partition way, such that one is used as testing subset, and the other is respectively combined with all standard patterns of butterfly images or the standard patterns of butterfly images with the same species as the images from living environment to get two different training subsets. In order to construct the training subset for Faster R-CNN, nine methods are adopted to amplify the images in the training subset including the turning of up and down, and left and right, rotation with different angles, adding noises, blurring, and contrast ratio adjusting etc. Three kinds of network structure based prediction models are trained. The mAP (mean average prediction) criterion is used to evaluate the performance of the predictive models. The experimental results demonstrate that our Faster R-CNN based butterfly automatic identification system performs well. Its worst mAP is up to 60%, and it can simultaneously detect the positions of more than one butterflies in one image from living environment and can recognize their species as well.
    Related Articles | Metrics
    An Overlapping Community Detection Algorithm Based on Centrality Measurement of Network Node
    Du Hangyuan, Wang Wenjian,Bai Liang
    Journal of Computer Research and Development    2018, 55 (8): 1619-1630.   DOI: 10.7544/issn1000-1239.2018.20180187
    Abstract1433)   HTML9)    PDF (2359KB)(1355)       Save
    Based on the idea of density peak clustering method, a centrality measurement model for network nodes is designed, and a new community detection algorithm for overlapping network is also proposed. In the algorithm, the cohesion and separation of network nodes are defined at first, to describe the structural feature of community that the intra links inside one community are dense while the inter links between communities are sparse. Depend on that, centrality measurement is calculated for each node to express its influence on network community structure. Then the nodes with tremendous centralities are selected by the 3δ principle as community centers. The overlapping features between communities are represented by memberships, and the iterative calculation methods for the memberships of non-central nodes are put forward. After that, according to their memberships, all the nodes in network can be allocated to their possible communities to accomplish the overlapping community detection. At last, the proposed algorithm is verified by the simulation on both synthetic networks and social networks. The simulation results reflect that our algorithm outperforms other competitive overlapping community detection algorithms in respect of both detection quality and computational efficiency.
    Related Articles | Metrics
    A Distributed Representation Model for Short Text Analysis
    Liang Jiye, Qiao Jie, Cao Fuyuan,Liu Xiaolin
    Journal of Computer Research and Development    2018, 55 (8): 1631-1640.   DOI: 10.7544/issn1000-1239.2018.20180233
    Abstract1484)   HTML13)    PDF (2015KB)(620)       Save
    The distributed representation of short texts has become an important task in text mining. However, the direct application of the traditional Paragraph Vector may not be suitable, and the fundamental reason is that it does not make use of the information of corpus in training process, so it can not effectively improve the situation of insufficient contextual information in short texts. In view of this, in this paper we propose a novel distributed representation model for short texts called BTPV (biterm topic paragraph vector). BTPV adds the topic information of BTM (biterm topic model) to the Paragraph Vector model. This method not only uses the global information of corpus, but also perfects the implicit vector of Paragraph Vector with the explicit topic information of BTM. At last, we crawl popular news comments from the Internet as experimental data sets, using K-Means clustering algorithm to compare the models’ representation performance. Experimental results have shown that the BTPV model can get better clustering results compared with the common distributed representation models such as word2vec and Paragraph Vector, which indicates the advantage of the proposed model for short text analysis.
    Related Articles | Metrics
    The Social and Conceptual Semantic Extended Search Method for Microblog Short Text
    Cui Wanqiu, Du Junping, Kou Feifei, Li Zhijian,Lee JangMyung
    Journal of Computer Research and Development    2018, 55 (8): 1641-1652.   DOI: 10.7544/issn1000-1239.2018.20180363
    Abstract1035)   HTML9)    PDF (3817KB)(720)       Save
    Mining the semantics of the microblog texts to realize accurate search is an essential task in microblog search. Because the content of the short texts in microblog has the characteristics of sparsity and semantic limitation, the traditional search methods which only analyze the semantics of literal text for short texts understanding and similarity matching have certain restriction. Therefore, we propose an extended search algorithm based on social and conceptual semantics. By exploiting the unique social attributes such as the #hashtag#, the mention “@” and the link information URL in the social network, we further extend the short texts in microblog through the social semantics. The method combines the conceptual words obtained from literal analysis of short texts with the potential associated hashtags information in a graph structure formed by social relationships. It performs the feature representation of short texts in two semantic extensions and achieves the precise search based on full mining of short texts meaning. Finally, we conduct experimental comparisons with traditionally extended search algorithms in the microblog datasets. The results show that the proposed algorithm can capture more semantics and has semantic enhancement function in the search for short texts of microblog. Moreover, the search performance has been significantly improved in the short texts of microblog.
    Related Articles | Metrics
    Variant Entropy Profile: A Multi-Granular Information Model for Data on Things with Order-of-Magnitude Compression Ratios
    Chao Lu, Peng Xiaohui,Xu Zhiwei
    Journal of Computer Research and Development    2018, 55 (8): 1653-1666.   DOI: 10.7544/issn1000-1239.2018.20180219
    Abstract1497)   HTML9)    PDF (4139KB)(521)       Save
    In recent years, the massive produced data by the devices of edges and things has brought new paradigms like edge computing and things computing to apply in the Internet of things, which can optimize the performance and energy consumption by moving the computation tasks to the data source as near as possible. However, innumerous resource-constrained devices of things expose two defects of current paradigms, which are computations cannot be offloaded to the endpoint due to the lack of massive data storage capacity, and the redundant computation and storage for raw data bring overheads due to the lack of multi-granular information support for various application demands. To address these two issues, this article proposes a multi-granular information model for data on things with order-of-magnitude compression ratios, called variant entropy model (VEP), and implements a prototype storage module of TSR-VEP. Evaluations on the real smart meter datasets and benchmarks show that VEP can achieve order-of-magnitude compression ratios and multi-granular information storage and query under low application observed errors. Discussion on the test results demonstrates the feasibility of applying VEP on devices of things and the potential of further optimizing for edge computing and things computing.
    Related Articles | Metrics
    Left-vs-Right Eye Discrimination Based on Convolutional Neural Network
    Zhong Zhiquan, Yuan Jin,Tang Xiaoying
    Journal of Computer Research and Development    2018, 55 (8): 1667-1673.   DOI: 10.7544/issn1000-1239.2018.20180215
    Abstract1746)   HTML22)    PDF (1407KB)(966)       Save
    In this paper, a new method to automatically discriminate the left and right eyes is proposed and validated utilizing a deep convolutional neural network. All parameters of the designed network are automatically estimated based on the characteristic differences between the left and right eye images. On the basis of the Alexnet network, the convolutional neural network designed in this paper consists of four convolutional layers and two fully connected layers, followed by a classifier serving as its last layer. According to our experimental results on a total of 42541 fundus images, the training accuracy of our network is about 100%, and the testing accuracy is as high as 99%. In addition, the proposed network is highly robust given that it successfully works for a large amount of fundus images with great variability. As far as we know, this is the first deep learning based network for left-vs-right eye discrimination that exhibits very high accuracy and precision.
    Related Articles | Metrics
    Predicting Biological Functions of G Protein-Coupled Receptors Based on Fast Multi-Instance Multi-Label Learning
    Wu Jiansheng, Feng Qiaoyu, Yuan Jingzhou, Hu Haifeng, Zhou Jiate,Gao Hao
    Journal of Computer Research and Development    2018, 55 (8): 1674-1682.   DOI: 10.7544/issn1000-1239.2018.20180361
    Abstract1074)   HTML2)    PDF (925KB)(432)       Save
    G protein-coupled receptors (GPCRs) constitute the largest family among human membrane proteins which are the important targets of many drugs on the market. An accurate annotation of the biological functions of GPCR proteins is key to understand their involved biological processes and drug-acting mechanisms. In our previous work, we found that protein function prediction problem can be formulated as a multi-instance multi-label learning (MIML) task. In this paper, we propose a novel method for predicting biological functions of G protein-coupled receptors by using a fast MIML learning called MIMLfast along with a hybrid feature. The hybrid feature consists of amino acid triple information, amino acid correlation information, evolutionary information, secondary structure correlation information, signal peptide information, disordered residue information, physical and chemical properties among GPCR domains. The experimental results show that our method achieves good performance which is superior to state-of-the-art multi-instance multi-label learning methods, multi-label learning methods and CAFA protein function prediction methods.
    Related Articles | Metrics
    U-Statistics and Ensemble Learning Based Method for Gene-Gene Interaction Detection
    Guo Yingjie, Liu Xiaoyan, Wu Chenxi, Guo Maozu,Li Ao
    Journal of Computer Research and Development    2018, 55 (8): 1683-1693.   DOI: 10.7544/issn1000-1239.2018.20180365
    Abstract929)   HTML6)    PDF (3366KB)(553)       Save
    In qualitative genome-wide association studies (GWAS), most existing methods make strong assumptions on the interaction form between genes which limited their power. Lately, many methods for detecting gene-gene interaction have been developed, and among them, the gene-based methods have grown in popularity as they confer an advantage in both statistical power and biological interpretability. In this paper, we propose a hypothesis testing framework for gene-based gene-gene interaction detection based on U statistics and tree-based ensemble learners (GBUtrees). We construct a statistic that detects the deviation from the additive structure in the prediction of log odds ratio of a qualitative trait from each base learner, then average it for learners trained using different subsamples to turn it into the form of U statistics. GBUtrees benefits from both the non-linear modeling power of tree-based ensemble model and the asymptotic normality of U statistics. Our method makes no assumption on the form of interaction, which strengthens its power for detecting different kinds of interactions. Based on simulated studies of eight disease models and real data from the RA pathway in WTCCC dataset, we conclude that it is effective in detecting different kinds of interactions and can be useful for genome-wide association studies.
    Related Articles | Metrics
    A Deep Q-Network Method Based on Upper Confidence Bound Experience Sampling
    Zhu Fei, Wu Wen, Liu Quan,Fu Yuchen
    Journal of Computer Research and Development    2018, 55 (8): 1694-1705.   DOI: 10.7544/issn1000-1239.2018.20180148
    Abstract1619)   HTML12)    PDF (3221KB)(648)       Save
    Recently, deep reinforcement learning (DRL), which combines deep learning (DL) with reinforcement learning (RL) together, has become a hot topic in the field of artificial intelligence. Deep reinforcement learning has made a great breakthrough in the task of optimal policy solving with high dimensional inputs. To remove the temporary correlation among the observed transitions, deep Q-network uses a sampling mechanism called experience replay that replays transitions at random from the memory buffer, which breaks the relationship among samples. However, random sampling doesn’t consider the priority of sample’s transition in the memory buffer. As a result, it is likely to sample data with insignificant information excessively while ignoring informative samples during the process of network training, which leads to longer training time as well as unsatisfactory training effect. To solve this problem, we introduce the idea of priority to traditional deep Q-network and put forward a prioritized sampling algorithm based on upper confidence bound (UCB). It determines sample’s probability of being selected in memory buffer by reward, time step, and sampling times. The proposed approach assigns samples that haven’t been chosen, samples that are more valuable, and samples that have good results, with higher probability of being selected, which guarantees the diversity of samples, such that the agent is able to select action more effectively. Finally, simulation experiments of Atari 2600 games verify the approach.
    Related Articles | Metrics
    A Reinforcement Learning Algorithm for Traffic Offloading in Dense Heterogeneous Network
    Wang Qian, Nie Xiushan,Yin Yilong
    Journal of Computer Research and Development    2018, 55 (8): 1706-1716.   DOI: 10.7544/issn1000-1239.2018.20180310
    Abstract1355)   HTML9)    PDF (2327KB)(680)       Save
    With the explosive growth of numbers of Internet users and network traffic, the capacity of cellular mobile communication is already limited. In order to solve the contradiction between the increasing demand for high capacity and the limited resource, traffic offloading technology makes full use of the existing network, which offloads part of traffic from the cellular network into the other network and carries on the cooperation between networks, to improve the capacity of the cellular network greatly. Traffic offloading becomes one of the hot topics in the future research of wireless communication technology. In this paper, based on reinforcement learning, we propose a novel reinforcement learning algorithm for traffic offloading in dense heterogeneous network. Based on the previous experience and performance gain of the user offloading, this algorithm considers the system throughput of each state, and finds the optimal WiFi network access point (AP) by calculating the reward value. We also derive the optimal policy of traffic offloading decision to maximize the throughput of the system. Simulation results show that the reinforcement learning for traffic offloading can effectively avoid the collision caused by over offloading and rapid deterioration of system performance. Our scheme can effectively implement the adaptive traffic offloading control policy and achieve the cooperation between LTE and WiFi network guaranteeing the quality of service for users. The overall throughput of the dense heterogeneous network also reaches the maximum.
    Related Articles | Metrics
    A Measurable Bayesian Network Structure Learning Method
    Qi Xiaolong, Gao Yang, Wang Hao, Song Bei, Zhou Chunlei,Zhang Youwei
    Journal of Computer Research and Development    2018, 55 (8): 1717-1725.   DOI: 10.7544/issn1000-1239.2018.20180197
    Abstract1307)   HTML9)    PDF (1062KB)(3099)       Save
    In this paper, a Bayesian network structure learning method via variable ordering based on mutual information (BNS\+{vo}-learning) is presented, which includes two components: the metric information matrix learning and the “lazy” heuristic strategy. The matrix of measurement information characterizes the degree of dependency among variables and implies the degree of strength comparison, which effectively solves the problem of misjudgment due to order of variables in the independence test process. Under the guidance of metric information matrix, the “lazy” heuristic strategy selectively adds variables to the condition set in order to effectively reduce high-order tests and reduce the number of tests. We theoretically prove the reliability of the new method and experimentally demonstrate that the new method searches significantly faster than other search processes. And BNS\+{vo}-learning is easily extended to small and sparse data sets without losing the quality of the learning structure.
    Related Articles | Metrics
    Linguistic-Valued Intuitionistic Fuzzy Concept Lattice and Its Application
    Zou Li, Feng Kaihua,Liu Xin
    Journal of Computer Research and Development    2018, 55 (8): 1726-1734.   DOI: 10.7544/issn1000-1239.2018.20180240
    Abstract995)   HTML6)    PDF (1169KB)(604)       Save
    Natural language is always used to reason or make decision. For the uncertain problems represented by linguistic values, inspired by the linguistic value intuitionistic fuzzy algebra and the intuitionistic fuzzy formal context, the linguistic-valued intuitionistic fuzzy formal context is proposed and its related properties are analyzed. Accordingly, the linguistic-valued intuitionistic fuzzy concept lattice is constructed. Based on the degree of similarity between two linguistic-valued intuitionistic fuzzy concepts, the degree of similarity between two linguistic-valued intuitionistic fuzzy concept lattices is presented, and the pattern recognition method based on linguistic-valued intuitionistic fuzzy concept lattice is proposed. The effectiveness and practicability of this method are illustrated by an example of diagnosis and recognition of Chinese medicine diseases.
    Related Articles | Metrics
    An Algorithm for Computing Core of Boolean Game
    Wang Bo, Liu Jinglei
    Journal of Computer Research and Development    2018, 55 (8): 1735-1750.   DOI: 10.7544/issn1000-1239.2018.20180360
    Abstract1143)   HTML8)    PDF (2644KB)(472)       Save
    Boolean game is an important framework to compute the solution of multi-agent cooperation, which represents the static agent games scenario based on propositional logic. In Boolean games, every agent uses propositional formulas to represent its goals, so whether its goals can be satisfied depending on the propositional formulas’ truth value. At present, Boolean game is mainly studied from the knowledge representation perspective and solving pure Nash equilibrium, but computing core from coalitional view is not enough. Computing core of Boolean games is the procedure of generating strategy profiles and comparing with strategy profiles. Firstly, in order to solve the solution of constraint satisfaction problem of Boolean games, we structure hypergraph on Boolean games where the action variables are regarded as the vertices and the goals are regarded as the hyperedge. Secondly, we see agents as the vertices and the dependence relationship between agents as edge to structure a directed dependency graph, which can get a set of stable sets to decompose Boolean games as smaller Boolean games on the scale. These two structures simplify the generation process and comparison process, and then improve the efficiency of computing core of Boolean games to some extent. Then, based on the hypertree decomposition and the stable set decomposition of the dependent graph, we give different methods of computing core of Boolean game. Finally, we verify the validity of this algorithm through the experimental evaluation.
    Related Articles | Metrics
    Exploiting Label Relationships in Multi-Label Classification with Neural Networks
    Song Pan, Jing Liping
    Journal of Computer Research and Development    2018, 55 (8): 1751-1759.   DOI: 10.7544/issn1000-1239.2018.20180362
    Abstract1703)   HTML17)    PDF (2022KB)(869)       Save
    Multi-label learning is critical in many real world application domains including text classification, image annotation, video semantic annotation, gene function analysis, etc. Recently, multi-label learning has attracted intensive attention and generated a hot research topic in machine learning community. However, the existing methods do not adequately address two key challenges: exploiting correlations between labels and making up for the lack of labeled data or even missing labels. A NN_AD_Omega model via neural network for exploring labels dependencies is proposed to handle these two challenges efficiently. NN_AD_Omega model introduces an Omega matrix in the top layer of the neural network to characterize the labels dependencies. As a good by-product, the learnt label correlations have ability to improve prediction performance when the instances’ partial labels are missing because they can capture the intrinsic structure among data. In order to solve the model efficiently, we use the mini-batch gradient descent (Mini-batch-GD) method to solve the optimization problem, meanwhile, the AdaGrad technique is adopted to adaptively search the learning rate. Experiments on four real multi-label datasets demonstrate that the proposed method can exploit the label correlations and handle the missing label data, and obtain promising and better label prediction results than the state-of-the-art neural network based multi-label learning methods.
    Related Articles | Metrics
    Fast Low-Rank Shared Dictionary Learning with Sparsity Constraints on Face Recognition
    Tian Ze, Yang Ming,Li Aishi
    Journal of Computer Research and Development    2018, 55 (8): 1760-1772.   DOI: 10.7544/issn1000-1239.2018.20180364
    Abstract1232)   HTML10)    PDF (2672KB)(494)       Save
    Dictionary learning is one of the most important feature representation methods. It has a wide range of applications in face recognition and other aspects. It is particularly suitable for solving face recognition problems under the change of pose, and has attracted much attention from many researchers. In order to enhance the discriminative ability of dictionary, researchers have put forward a large number of dictionary learning models in combination with domain knowledge and anti-noise strategies, including the recently proposed methods for simultaneous dimensionality reduction and dictionary learning, but these methods focus on the specific-class samples and fail to consider the sharing information between training samples. Therefore, we propose a fast low-rank shared dictionary learning with sparsity constraints approach. The method learns dimensionality reduction and dictionary jointly, and embeds Fisher discriminant criteria to obtain specific-class dictionary and coding coefficients. At the same time, we enforce a low-rank constraint to obtain the low-rank shared dictionary to enhance the discriminative ability of dictionary and coding coefficients. In addition, the Cayley transform is used to protect the orthogonality of the projection matrix to catch a compact feature set. Face recognition experiments on AR, Extended Yale B, CMU PIE, and FERET datasets demonstrate the superiority of our approach. The experimental results show that the proposed method has strong robustness to face recognition under facial expression changes, and plays an inhibitory role in lighting. It is especially suitable for solving small sample problems under illumination and expression changes.
    Related Articles | Metrics
    Semantical Symbol Mapping Embedding Learning Algorithm for Knowledge Graph
    Yang Xiaohui, Wan Rui, Zhang Haibin, Zeng Yifu,Liu Qiao
    Journal of Computer Research and Development    2018, 55 (8): 1773-1784.   DOI: 10.7544/issn1000-1239.2018.20180248
    Abstract1870)   HTML27)    PDF (2451KB)(924)       Save
    Learning graph embedding is a crucial research issue in the field of statistical relational learning and knowledge graph population, and it is important for the construction and application of knowledge graph in recent years. In this paper, we perform a comparative study of the prevalent knowledge representation based reasoning models, with detailed discussion of the general potential problems contained in their basic assumptions. A semantical symbol sensory projection based neural network model is proposed in order to learn graph embedding, whose basic idea is to utilize the recurrent neural network for encoding the compositional representation of symbol strings (composition of entity-relation) onto their target grounded symbol according to the existing relational data in knowledge. In addition, we introduce the inverse image of the relations into the system to deal with the symmetricasymmetric properties of the relations, which makes the proposed model more adaptable to different types of reasoning tasks on a variety of homogeneous and heterogeneous networks than other solutions. The proposed model is suitable for large scale knowledge graph representation learning. Experimental results on benchmark datasets show that the proposed model achieves state-of-the-art performance on both of the knowledge based completion benchmark tests and the graph based multi-label classification tasks.
    Related Articles | Metrics