ISSN 1000-1239 CN 11-1777/TP


    Journal of Computer Research and Development    2017, 54 (6): 1131-1132.  
    Satisfaction Prediction of Web Search Users
    Liu Yiqun
    Journal of Computer Research and Development    2017, 54 (6): 1133-1143.   DOI: 10.7544/issn1000-1239.2017.20160804
    User satisfaction is one of the prime concerns of Web search studies. Predicting it is non-trivial for three major reasons: 1) Traditional approaches to search performance evaluation mainly rely on editorial judgments of the relevance of search results, and the relationship between search satisfaction and relevance-based evaluation remains under-investigated. 2) Most existing studies assume that all results on search engine result pages (SERPs) are homogeneous, whereas a variety of heterogeneous components have been aggregated into modern SERPs to improve search performance. 3) Most existing studies on satisfaction prediction rely primarily on users’ click-through and query reformulation behaviors, but many search sessions contain no such information. In this paper, we summarize our recent efforts to shed light on these research questions. First, we perform a laboratory study to investigate the relationship between relevance and users’ perceived usefulness and satisfaction. We then investigate the impact of vertical results with different qualities, presentation styles and positions on search satisfaction, using specifically designed SERPs. Finally, inspired by recent studies that predict result relevance from mouse movement patterns, we propose novel strategies to extract high-quality mouse movement patterns from SERPs for satisfaction prediction. Experimental results show that our proposed method outperforms existing approaches in heterogeneous search environments.
    Recent Advances in Neural Machine Translation
    Liu Yang
    Journal of Computer Research and Development    2017, 54 (6): 1144-1149.   DOI: 10.7544/issn1000-1239.2017.20160805
    Machine translation, which aims at automatically translating between natural languages using computers, is one of the important research directions in artificial intelligence and natural language processing. Recent years have witnessed the rapid development of neural machine translation, which has replaced conventional statistical machine translation to become the new mainstream technique in both academia and industry. This paper first introduces the basic ideas and state-of-the-art approaches in neural machine translation and then reviews recent important research findings. The paper concludes with a discussion of possible future directions.
    A Survey on Sentiment Classification
    Chen Long, Guan Ziyu, He Jinhong, Peng Jinye
    Journal of Computer Research and Development    2017, 54 (6): 1150-1170.   DOI: 10.7544/issn1000-1239.2017.20160807
    Sentiment analysis of text is an important research field for intelligent multimedia understanding. The aim of sentiment classification, which is the core of sentiment analysis, is to predict the sentiment polarity of opinionated text. With the rapid growth of online opinionated content, traditional approaches such as lexicon-based methods and classic machine learning methods cannot handle large-scale sentiment classification problems well. In recent years, deep learning has achieved good performance on the intelligent understanding of large-scale text data and has attracted a lot of attention, and more and more researchers have started to address text classification problems with deep learning. This survey is organized in two parts. First, we summarize the existing approaches, including lexicon-based methods, machine learning based methods, hybrid methods, methods based on weakly labeled data and deep learning based methods. Second, we introduce our proposed weakly-supervised deep learning framework, which addresses the defects of the previous approaches. Moreover, we briefly summarize the research work on the extraction of opinion aspects. Finally, we discuss the challenges and future work on sentiment classification.
    Label Enhancement for Label Distribution Learning
    Geng Xin, Xu Ning, Shao Ruifeng
    Journal of Computer Research and Development    2017, 54 (6): 1171-1184.   DOI: 10.7544/issn1000-1239.2017.20170002
    Multi-label learning (MLL) deals with the case where each instance is associated with multiple labels. Its goal is to learn the mapping from an instance to its relevant label set. Most existing MLL methods adopt the uniform label distribution assumption, i.e., all relevant (positive) labels are equally important for the instance. However, in many real-world learning problems, different relevant labels often differ in importance. To address this issue, label distribution learning (LDL) has achieved good results by modeling the different importance of labels with a label distribution. Unfortunately, many datasets contain only simple logical labels rather than label distributions. One way to solve this problem is to transform the logical labels into label distributions by mining the hidden label importance from the training examples, and then to improve prediction precision via label distribution learning. This process of transforming logical labels into label distributions is defined as label enhancement for label distribution learning. This paper first proposes the concept of label enhancement with a formal definition. Then, existing algorithms that can be used for label enhancement are surveyed and compared experimentally. The experimental results show that label enhancement can effectively discover the differences in label importance hidden in the data and improve the performance of multi-label learning.
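As a rough illustration of what transforming logical labels into label distributions can look like, the sketch below smooths each instance's 0/1 label vector with the labels of its nearest neighbours and then normalises over the instance's own relevant labels. This is not one of the algorithms surveyed in the paper; the function name `enhance_labels`, the parameter `k`, and the neighbour-voting scheme are assumptions made only for this example.

```python
import math


def enhance_labels(features, logical_labels, k=2):
    """Turn 0/1 label vectors into label distributions (hypothetical kNN smoothing)."""
    distributions = []
    for i, x in enumerate(features):
        # Rank the other instances by Euclidean distance to instance i.
        neighbours = sorted(
            (j for j in range(len(features)) if j != i),
            key=lambda j: math.dist(x, features[j]),
        )[:k]
        # Start from the instance's own logical labels and add neighbour votes.
        scores = [float(v) for v in logical_labels[i]]
        for j in neighbours:
            for c, value in enumerate(logical_labels[j]):
                scores[c] += value
        # Keep only labels that are relevant for instance i, then normalise.
        scores = [s if logical_labels[i][c] else 0.0 for c, s in enumerate(scores)]
        total = sum(scores)
        distributions.append([s / total if total else 0.0 for s in scores])
    return distributions
```

With this scheme, a relevant label that is also common among an instance's neighbours receives a larger description degree than a relevant label that is rare nearby, so the output no longer treats all positive labels as equally important.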
    Probability Distribution Based Evolutionary Computation Algorithms for Multimodal Optimization
    Chen Weineng, Yang Qiang
    Journal of Computer Research and Development    2017, 54 (6): 1185-1197.   DOI: 10.7544/issn1000-1239.2017.20160891
    Evolutionary computation (EC) is a category of algorithms that simulate intelligent evolutionary behavior in nature to solve optimization problems. As EC algorithms do not rely on the mathematical characteristics of the problem model, they have been regarded as an important tool for complex optimization. The estimation of distribution algorithm (EDA) is a new class of EC algorithms, which works by constructing a probability model to estimate the distribution of the predominant individuals in the population and sampling new individuals from that model. With this probability-based search behavior, EDA is good at maintaining sufficient search diversity and is applicable in both continuous and discrete search spaces. To promote research on probability-based EC (PBEC) algorithms, this paper gives a survey of EC algorithms for multimodal optimization and then builds two frameworks for PBEC: a PBEC framework for seeking multiple solutions in multimodal optimization, and a PBEC framework for discrete optimization. The first framework presents a method to combine probability-based evolutionary operators with the niching strategy, so that higher search diversity can be maintained when seeking multiple solutions in multimodal optimization. In particular, the framework understands PBEC algorithms in a broad sense, that is, it allows both explicit PBEC algorithms (e.g. EDA) and implicit PBEC algorithms (e.g. ant colony optimization) to operate in the framework, resulting in two representative algorithms: multimodal EDA (MEDA) and adaptive multimodal ant colony optimization (AM-ACO). The second framework aims at extending the applicability of EC algorithms to both continuous and discrete spaces. Since some popular EC algorithms are originally defined on a continuous real vector space and cannot be directly used to solve discrete optimization problems, this framework introduces the idea of probability distribution based evolution and redefines their evolutionary operators on a discrete set space. As a result, the applicability of these algorithms can be significantly improved.
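The basic EDA loop described in the abstract (build a probability model from the better individuals, then sample new individuals from it) can be sketched as below. This is a minimal univariate EDA on the toy OneMax problem, not the MEDA or AM-ACO algorithms from the paper; the function name `umda_onemax`, all parameter values, and the clamping bounds are assumptions chosen for illustration.

```python
import random


def umda_onemax(n_bits=20, pop_size=100, n_select=30, generations=30, seed=0):
    """Univariate EDA on OneMax (maximise the number of 1 bits)."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # probability model: P(bit i = 1)
    best = None
    for _ in range(generations):
        # Sample a new population from the current probability model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # Select the fittest ("predominant") individuals.
        population.sort(key=sum, reverse=True)
        elite = population[:n_select]
        if best is None or sum(elite[0]) > sum(best):
            best = elite[0]
        # Re-estimate the model from the elite; clamp to keep some diversity.
        probs = [
            min(0.95, max(0.05, sum(ind[i] for ind in elite) / n_select))
            for i in range(n_bits)
        ]
    return best, probs
```

Calling `umda_onemax()` returns the best bit string found and the final per-bit probabilities. Clamping the probabilities away from 0 and 1 is one simple way to keep sampling diverse; niching, as used in the paper's first framework, is a different and stronger mechanism.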
    Survey of Database Usability for Query Results
    Liu Qing, Gao Yunjun
    Journal of Computer Research and Development    2017, 54 (6): 1198-1212.   DOI: 10.7544/issn1000-1239.2017.20160806
    Database usability has received much attention in the database community. Its goal is to help users use databases more efficiently and conveniently, thereby improving user satisfaction. In this survey, we focus on database usability for query results. Currently, a database system returns only the query results to users. If the results are unexpected, users become frustrated, yet the system neither explains the unexpected results nor suggests how to obtain the expected ones. Users can only debug the queries by themselves, which is cumbersome and time-consuming. If the database system could offer such explanations and suggestions, users would understand the initial query better and know how to change it until satisfactory results are found, which would improve the usability of the database. To this end, unexpected query results have been studied. In this paper, we provide a comprehensive survey of the most recent research on database usability for query results. The paper first analyzes unexpected query results and introduces the three corresponding problems, i.e., causality & responsibility, why-not & why questions, and why-few & why-many questions, and highlights their importance. Then, the state-of-the-art progress in research on unexpected query results is surveyed and summarized. Finally, the paper raises some directions for future work.
    A Survey of Distributed RDF Data Management
    Zou Lei, Peng Peng
    Journal of Computer Research and Development    2017, 54 (6): 1213-1224.   DOI: 10.7544/issn1000-1239.2017.20160908
    Recently, RDF (Resource Description Framework) has been widely used to expose, share, and connect pieces of data on the Web, while SPARQL (SPARQL Protocol and RDF Query Language) is a structured query language for accessing RDF repositories. As RDF datasets increase in size, evaluating SPARQL queries over current RDF repositories is beyond the capacity of a single machine. As a result, a high-performance distributed RDF database system is needed to evaluate SPARQL queries efficiently. A large body of work on distributed RDF data management follows different approaches, and this paper provides an overview of it. The survey considers three kinds of approaches: cloud-based distributed data management approaches, partitioning-based distributed data management approaches and federated RDF systems. Simply speaking, cloud-based approaches use existing cloud platforms to manage large RDF datasets; partitioning-based approaches divide an RDF graph into several fragments and place each fragment at a different site in a distributed system; and federated RDF systems do not allow the data to be re-partitioned, since the data has already been distributed over autonomous sites. For each kind of approach, further discussion is provided to help readers understand the characteristics of the different approaches.
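To make the partitioning-based idea concrete, the sketch below splits a set of RDF triples into fragments by hashing each triple's subject, which is one simple partitioning strategy among many. The function name `partition_by_subject` and the `ex:` identifiers are hypothetical, and the abstract does not say which partitioning schemes the survey covers.

```python
import zlib
from collections import defaultdict


def partition_by_subject(triples, n_sites):
    """Assign each (subject, predicate, object) triple to a site by hashing its subject."""
    fragments = defaultdict(list)
    for s, p, o in triples:
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        site = zlib.crc32(s.encode("utf-8")) % n_sites
        fragments[site].append((s, p, o))
    return dict(fragments)


# Hypothetical toy data for illustration only.
triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:age", "30"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:carol", "ex:livesIn", "ex:paris"),
]
fragments = partition_by_subject(triples, n_sites=3)
```

Because all triples that share a subject land on the same site, a query that only touches the properties of one subject can be answered at a single site, while queries that follow edges between subjects may still need data from several sites.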
    High-Throughput Image and Video Computing
    Tang Jinhui, Li Zechao, Liu Shaoli, Qin Lei
    Journal of Computer Research and Development    2017, 54 (6): 1225-1237.   DOI: 10.7544/issn1000-1239.2017.20170001
    In recent years, image and video data has grown and spread rapidly on the Internet. The data is not only huge in volume but also characterized by high concurrency, high dimensionality and high throughput, which poses great challenges to its real-time analysis and processing. To improve the efficiency of image and video data processing on big data platforms, it is necessary to study the task of high-throughput image and video computing and to propose a series of high-throughput image and video computing theories and methods that take new hardware structures into account. To this end, this work first reviews previous high-throughput image and video computing theories and methods in detail, and then discusses the disadvantages of the existing methods. Furthermore, this work analyzes three future research directions for the high-throughput image and video computing task: high-throughput image and video computing theories, high-throughput image and video analysis methods, and high-throughput video coding methods. Finally, this work introduces three key scientific problems of high-throughput image and video computing. Solving these problems will provide key technical support for applications such as content monitoring of Internet images and videos, large-scale video surveillance, and image and video search.
    Video Copy Detection Method: A Review
    Gu Jiawei, Zhao Ruiwei, Jiang Yugang
    Journal of Computer Research and Development    2017, 54 (6): 1238-1250.   DOI: 10.7544/issn1000-1239.2017.20170003
    A large number of copied videos currently exist on the Internet. To identify these videos, researchers have long been studying video copy detection methods. In recent years, a few new video copy detection algorithms have been proposed with the introduction of deep learning. In this article, we provide a review of the existing representative video copy detection methods. We introduce the general framework of a video copy detection system as well as the various implementation choices for its components, including feature extraction, indexing, feature matching and time alignment. The discussed approaches include the latest deep learning based methods, mainly the application of deep convolutional neural networks and siamese convolutional neural networks in video copy detection systems. Furthermore, we summarize the evaluation criteria used in video copy detection and discuss the performance of some representative methods on five popular datasets. Finally, we envision future directions on this topic.
    The Semantic Knowledge Embedded Deep Representation Learning and Its Applications on Visual Understanding
    Zhang Ruimao, Peng Jiefeng, Wu Yang, Lin Liang
    Journal of Computer Research and Development    2017, 54 (6): 1251-1266.   DOI: 10.7544/issn1000-1239.2017.20171064
    With the rapid development of deep learning techniques and large-scale visual datasets, traditional computer vision tasks have achieved unprecedented improvement. To handle increasingly complex vision tasks, how to integrate domain knowledge into deep neural networks and enhance the ability of deep models to represent visual patterns has become a widely discussed topic in both academia and industry. This thesis explores effective deep models that combine semantic knowledge and feature learning. The main contributions can be summarized as follows: 1) We integrate the semantic similarity of visual data into the deep feature learning process, and propose a deep similarity comparison model named bit-scalable deep hashing to address the issue of visual similarity comparison. The model has achieved strong performance on image search and person identification. 2) We also propose a high-order graph LSTM (HG-LSTM) network to solve the problem of geometric attribute analysis, which integrates multiple semantic contexts into the feature learning process. Our extensive experiments show that our model is capable of predicting rich scene geometric attributes and outperforms several state-of-the-art methods by large margins. 3) We integrate the structured semantic information of visual data into the feature learning process, and propose a novel deep architecture to investigate a fundamental problem of scene understanding: how to parse a scene image into a structured configuration. Extensive experiments show that our model is capable of producing meaningful and structured scene configurations and achieves more favorable scene labeling results on two challenging datasets than other state-of-the-art weakly-supervised deep learning methods.
    Query and Feedback Technologies in Multimedia Information Retrieval
    Zha Zhengjun, Zheng Xiaoju
    Journal of Computer Research and Development    2017, 54 (6): 1267-1280.   DOI: 10.7544/issn1000-1239.2017.20170004
    In spite of the remarkable progress made in the past decades, multimedia information retrieval still suffers from the “intention gap” and the “semantic gap”. To address these issues, researchers have proposed a wealth of query technologies to help users express their search intent clearly, as well as feedback technologies to help retrieval systems understand user intent and multimedia data accurately, leading to significant improvements in retrieval performance. This paper presents a survey of the query and feedback technologies in multimedia information retrieval. We summarize the evolution of query styles and the development of feedback approaches. We elaborate on the query approaches for retrieval on PCs, mobile intelligent devices, touch-screen devices and so on. We introduce the feedback approaches proposed in different periods and discuss the interaction issue in exploratory multimedia retrieval. Finally, we discuss future research directions in this field.
    The Construction, Analysis, and Applications of Dynamic Protein-Protein Interaction Networks
    Li Min, Meng Xiangmao
    Journal of Computer Research and Development    2017, 54 (6): 1281-1299.   DOI: 10.7544/issn1000-1239.2017.20160902
    The rapid development of proteomics and high-throughput technologies has produced a large amount of protein-protein interaction (PPI) data, which provides a foundation for further understanding the interactions between proteins and the biomedical mechanisms of complex diseases. In an organism, a protein-protein interaction network (PIN) consists of all the proteins and their interactions. Most traditional studies on PINs are based on static networks. However, due to the dynamics of protein expression and of PPIs, real PINs change with time and conditions. Protein function modules related to the occurrence and development of diseases are also bound up with this dynamic change. Researchers have shifted their attention from static properties to dynamic properties and proposed a series of methods for the construction of dynamic PINs. This paper reviews the construction, analysis and applications of dynamic PINs. First, the existing dynamic PIN construction methods are discussed in three categories: methods based on dynamic protein expression, methods based on multi-state expression and correlation changes, and methods based on spatial-temporal dynamic changes. The first category captures how protein expression varies with time; the second reflects the changes in expression correlation between proteins under different conditions; and the third describes the dynamics of proteins and their interactions in time and space. Then, the dynamic analysis of the proteins and the related subnetworks of dynamic PINs is reviewed. Furthermore, the main applications of dynamic PINs in complex diseases are discussed in detail, such as the identification of protein complexes and functional modules, the detection of biomarkers, and the prediction of disease genes. Finally, the challenges and future research directions of dynamic PINs are discussed.
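One way to picture the first construction category (methods based on dynamic protein expression) is the sketch below: an interaction from the static network is kept at a time point only when both proteins are expressed above a threshold at that time. The single global threshold, the function name `build_dynamic_pin`, and the toy data are assumptions; the abstract does not specify how the surveyed methods decide when a protein is active.

```python
def build_dynamic_pin(static_edges, expression, threshold):
    """Return one edge list per time point; an edge is kept when both proteins are active."""
    n_points = len(next(iter(expression.values())))
    snapshots = []
    for t in range(n_points):
        # A protein counts as active at time t if its expression reaches the threshold.
        active = {p for p, series in expression.items() if series[t] >= threshold}
        snapshots.append(
            [(a, b) for a, b in static_edges if a in active and b in active]
        )
    return snapshots


# Hypothetical toy data: three proteins measured at three time points.
static_edges = [("A", "B"), ("B", "C")]
expression = {"A": [0.9, 0.2, 0.8], "B": [0.7, 0.9, 0.1], "C": [0.1, 0.8, 0.9]}
snapshots = build_dynamic_pin(static_edges, expression, threshold=0.5)
```

Each element of `snapshots` is the subnetwork present at one time point, so the static network becomes a sequence of time-specific networks that can be analysed separately.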
    Realtime Capture of High-Speed Traffic on Multi-Core Platform
    Ling Ruilin, Li Junfeng, Li Dan
    Journal of Computer Research and Development    2017, 54 (6): 1300-1313.   DOI: 10.7544/issn1000-1239.2017.20160823
    With the development of Internet applications and the increase of network bandwidth, security issues are becoming increasingly serious. In addition to the spread of viruses, spam and DDoS attacks, many well-hidden attack methods have appeared. Network probe tools, which are deployed as bypass devices at the gateway of an intranet, can collect and analyze all the traffic of the current network. The most important module of a network probe is packet capture. The Linux network protocol stack has many performance bottlenecks in packet processing, so it cannot meet the demands of high-speed network environments. In this paper, we introduce several new packet capture engines based on zero-copy and multi-core technology. Further, we design and implement a scalable, high-performance packet capture framework based on Intel DPDK, which uses RSS (receive-side scaling) to parallelize packet capture and supports customized packet processing. Additionally, this paper discusses a more effective and fairer hash function by which packets can be delivered to different receive queues. The evaluation shows that the system can capture and process packets at nearly line speed and balance the load among CPU cores.
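The idea of hashing packets onto receive queues can be illustrated with the sketch below, which maps a flow's addresses and ports to a queue index and is symmetric, so both directions of a connection reach the same queue and therefore the same core. This is not the hash function proposed in the paper and not the DPDK API: NIC hardware typically computes RSS with a Toeplitz hash, and CRC32 is used here purely for illustration. The function name `queue_for_packet` is hypothetical.

```python
import socket
import struct
import zlib


def queue_for_packet(src_ip, src_port, dst_ip, dst_port, n_queues):
    """Map a flow to a receive queue; both directions of a flow get the same queue."""
    a = (socket.inet_aton(src_ip), src_port)
    b = (socket.inet_aton(dst_ip), dst_port)
    # Order the two endpoints so that (a, b) and (b, a) hash identically.
    lo, hi = sorted((a, b))
    key = lo[0] + struct.pack("!H", lo[1]) + hi[0] + struct.pack("!H", hi[1])
    return zlib.crc32(key) % n_queues
```

For example, `queue_for_packet("10.0.0.1", 40000, "10.0.0.2", 80, 4)` and the same call with source and destination swapped return the same queue index, which keeps per-flow state on one core without locking.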