ISSN 1000-1239 CN 11-1777/TP

Most Downloaded Articles


    Big Data Management: Concepts, Techniques and Challenges
    Meng Xiaofeng and Ci Xiang
    Journal of Computer Research and Development   
    Accepted: 15 January 2020

    Knowledge Graph Construction Techniques
    Journal of Computer Research and Development    2016, 53 (3): 582-600.   DOI: 10.7544/issn1000-1239.2016.20148228
    Abstract views: 17204 | HTML views: 944 | PDF (2414KB) downloads: 25207
    Google’s knowledge graph technology has drawn a lot of research attention in recent years. However, due to the limited public disclosure of technical details, people find it difficult to understand the connotation and value of this technology. In this paper, we introduce the key techniques involved in the construction of a knowledge graph in a bottom-up way, starting from a clearly defined concept and a technical architecture of the knowledge graph. Firstly, we describe in detail the definition and connotation of the knowledge graph, and then we propose the technical framework for knowledge graph construction, in which the construction process is divided into three levels according to the abstraction level of the input knowledge materials: the information extraction layer, the knowledge integration layer, and the knowledge processing layer. Secondly, the research status of the key technologies for each level is surveyed comprehensively and investigated critically, with the aim of gradually revealing the mysteries of the knowledge graph technology, the state-of-the-art progress, and its relationship with related disciplines. Finally, five major research challenges in this area are summarized, and the corresponding key research issues are highlighted.
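As a toy illustration of the bottom-up pipeline described above, the sketch below runs an information extraction step followed by a knowledge integration step over two sentences. The regex pattern, alias table, and sentences are illustrative assumptions only; real systems use trained extraction and alignment models in place of these.

```python
import re

def extract_triples(text):
    """Information extraction layer: pull (head, relation, tail) triples
    out of raw text with a toy 'X is the capital of Y' pattern."""
    pattern = re.compile(r"(\w+) is the capital of (\w+)")
    return [(h, "capital_of", t) for h, t in pattern.findall(text)]

def integrate(triples):
    """Knowledge integration layer: canonicalize aliases and deduplicate."""
    aliases = {"Peking": "Beijing"}  # assumed alias table
    return {(aliases.get(h, h), r, aliases.get(t, t)) for h, r, t in triples}

corpus = "Beijing is the capital of China. Peking is the capital of China."
kg = integrate(extract_triples(corpus))
```

The two surface forms collapse into a single canonical triple, which is the essence of the integration layer.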
    Knowledge Representation Learning: A Review
    Liu Zhiyuan, Sun Maosong, Lin Yankai, Xie Ruobing
    Journal of Computer Research and Development    2016, 53 (2): 247-261.   DOI: 10.7544/issn1000-1239.2016.20160020
    Abstract views: 12343 | HTML views: 299 | PDF (3333KB) downloads: 21056
    Knowledge bases are usually represented as networks with entities as nodes and relations as edges. With the network representation of knowledge bases, specific algorithms have to be designed to store and utilize them, which are usually time consuming and suffer from the data sparsity issue. Recently, representation learning, represented by deep learning, has attracted much attention in natural language processing, computer vision and speech analysis. Representation learning aims to project the objects of interest into a dense, real-valued and low-dimensional semantic space, whereas knowledge representation learning focuses on the representation learning of entities and relations in knowledge bases. Representation learning can efficiently measure the semantic correlations of entities and relations, alleviate sparsity issues, and significantly improve the performance of knowledge acquisition, fusion and inference. In this paper, we introduce the recent advances in representation learning, summarize the key challenges and possible solutions, and give an outlook on future research and application directions.
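One well-known model of the translation-based kind this review surveys is TransE, which scores a triple (h, r, t) by how close h + r lands to t in the embedding space. The 2-dimensional embeddings below are hand-picked toy values, not trained parameters:

```python
def transe_score(h, r, t):
    """Negative L2 distance ||h + r - t||: higher means more plausible."""
    return -sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

# Toy 2-dimensional embeddings (illustrative values, not trained).
h = [1.0, 0.0]        # entity "Beijing"
r = [0.0, 1.0]        # relation "capital_of"
t_good = [1.0, 1.0]   # entity "China": h + r == t_good
t_bad = [5.0, -3.0]   # an unrelated entity
```

Because semantic plausibility becomes a simple vector computation, sparsity is alleviated and knowledge inference reduces to nearest-neighbor search.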
    Deep Learning: Yesterday, Today, and Tomorrow
    Yu Kai, Jia Lei, Chen Yuqiang, and Xu Wei
    Journal of Computer Research and Development    2013, 50 (9): 1799-1804.
    Abstract views: 8187 | HTML views: 626 | PDF (873KB) downloads: 14102
    Machine learning is an important area of artificial intelligence. Since the 1980s, huge success has been achieved in terms of algorithms, theory, and applications. Since 2006, a new machine learning paradigm named deep learning has been popular in the research community, and has become a major technology trend for big data and artificial intelligence. Deep learning simulates the hierarchical structure of the human brain, processing data from lower levels to higher levels and gradually composing more and more semantic concepts. In recent years, Google, Microsoft, IBM, and Baidu have invested a lot of resources into the R&D of deep learning, making significant progress on speech recognition, image understanding, natural language processing, and online advertising. In terms of the contribution to real-world applications, deep learning is perhaps the most successful progress made by the machine learning community in the last 10 years. In this article, we give a high-level overview of the past and current stage of deep learning, discuss the main challenges, and share our views on its future development.
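The layer-by-layer composition described above can be sketched as a tiny forward pass through two dense layers; each layer transforms its input into a higher-level representation. The weights are hypothetical fixed values rather than trained parameters:

```python
def relu(v):
    return [max(0.0, x) for x in v]

def dense(v, weights, bias):
    """One fully connected layer: out_j = sum_i v_i * W[i][j] + b_j."""
    return [sum(vi * wi for vi, wi in zip(v, col)) + b
            for col, b in zip(zip(*weights), bias)]

# Hypothetical fixed weights for a 2 -> 2 -> 1 network (not trained).
W1, b1 = [[1.0, -1.0], [0.5, 2.0]], [0.0, 0.0]
W2, b2 = [[1.0], [1.0]], [0.5]

x = [1.0, 2.0]
hidden = relu(dense(x, W1, b1))   # lower-level features
y = dense(hidden, W2, b2)         # higher-level output
```

Real networks stack many such layers and learn the weights by backpropagation; the structure, not the scale, is what this sketch shows.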
    Edge Computing: State-of-the-Art and Future Directions
    Shi Weisong, Zhang Xingzhou, Wang Yifan, Zhang Qingyang
    Journal of Computer Research and Development    2019, 56 (1): 69-89.   DOI: 10.7544/issn1000-1239.2019.20180760
    Abstract views: 8561 | HTML views: 1962 | PDF (3670KB) downloads: 6865
    With the burgeoning of the Internet of everything, the amount of data generated by edge devices increases dramatically, resulting in higher network bandwidth requirements. Meanwhile, the emergence of novel applications calls for lower network latency. It is an unprecedented challenge for cloud computing to guarantee quality of service while dealing with a massive amount of data, which has pushed the horizon of edge computing. Edge computing calls for processing the data at the edge of the network and has developed rapidly since 2014, as it has the potential to reduce latency and bandwidth charges, address the limited computing capability of cloud data centers, increase availability, and protect data privacy and security. This paper mainly discusses three questions about edge computing: where does it come from, what is its current status, and where is it going? This paper first sorts out the development process of edge computing and divides it into three periods: the technology preparation period, the rapid growth period and the steady development period. This paper then summarizes seven essential technologies that drive the rapid development of edge computing. After that, six typical applications that have been widely used in edge computing are illustrated. Finally, this paper proposes six open problems that need to be solved urgently in future development.
    A Survey on Entity Alignment of Knowledge Base
    Zhuang Yan, Li Guoliang, Feng Jianhua
    Journal of Computer Research and Development    2016, 53 (1): 165-192.   DOI: 10.7544/issn1000-1239.2016.20150661
    Abstract views: 6423 | HTML views: 132 | PDF (3322KB) downloads: 6009
    Entity alignment on knowledge bases has been a hot research topic in recent years. The goal is to link multiple knowledge bases effectively and create a large-scale, unified knowledge base from the top level to enrich the knowledge base, which can be used to help machines understand the data and build more intelligent applications. However, there are still many research challenges on data quality and scalability, especially in the context of big data. In this paper, we present a survey of the techniques and algorithms of entity alignment on knowledge bases over the past decade, and expect to provide alternative options for further research by classifying and summarizing the existing methods. Firstly, the entity alignment problem is formally defined. Secondly, the overall architecture is summarized and the research progress is reviewed in detail from the algorithm, feature matching and indexing aspects. Entity alignment algorithms are the key to solving this problem, and can be divided into pair-wise methods and collective methods. The most commonly used collective entity alignment algorithms are discussed in detail from the local and global aspects. Some important experimental and real-world data sets are introduced as well. Finally, open research issues are discussed and possible future research directions are prospected.
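A minimal sketch of the pair-wise family of methods mentioned above: match entities across two toy knowledge bases by name similarity and keep pairs above a threshold. Collective methods would additionally propagate evidence along relations; the threshold and the two entity lists are illustrative assumptions.

```python
from difflib import SequenceMatcher

def name_sim(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def align(kb_a, kb_b, threshold=0.8):
    """Pair-wise alignment: for each entity in kb_a, keep the most
    similar entity in kb_b if its score clears the threshold."""
    matches = []
    for a in kb_a:
        best = max(kb_b, key=lambda b: name_sim(a, b))
        if name_sim(a, best) >= threshold:
            matches.append((a, best))
    return matches

kb_a = ["Albert Einstein", "Isaac Newton"]
kb_b = ["albert einstein", "Marie Curie"]
```

Indexing (blocking) would be added in practice so that not every pair has to be compared.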
    Survey on Privacy Preserving Techniques for Blockchain Technology
    Zhu Liehuang, Gao Feng, Shen Meng, Li Yandong, Zheng Baokun, Mao Hongliang, Wu Zhen
    Journal of Computer Research and Development    2017, 54 (10): 2170-2186.   DOI: 10.7544/issn1000-1239.2017.20170471
    Abstract views: 9545 | HTML views: 452 | PDF (3265KB) downloads: 5983
    The core features of the blockchain technology are “de-centralization” and “de-trusting”. As a distributed ledger technology, smart contract infrastructure platform and novel distributed computing paradigm, it can effectively build programmable currency, programmable finance and programmable society, which will have a far-reaching impact on finance and other fields, and drive a new round of technological and application change. While blockchain technology can improve efficiency, reduce costs and enhance data security, it still faces serious privacy issues that have drawn wide attention from researchers. This survey first analyzes the technical characteristics of the blockchain, defines the concepts of identity privacy and transaction privacy, points out the advantages and disadvantages of blockchain technology in privacy protection, and introduces the attack methods in existing research, such as transaction tracing technology and account clustering technology. We then introduce a variety of privacy mechanisms, including malicious node detection and access restriction technology for the network layer; transaction mixing technology, encryption technology and limited release technology for the transaction layer; and some defense mechanisms for the blockchain application layer. In the end, we discuss the limitations of the existing technologies and envision future directions on this topic. In addition, the regulatory approach to malicious use of blockchain technology is discussed.
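The account clustering attack mentioned above can be sketched with the classic common-input-ownership heuristic: all input addresses of a single transaction are assumed to share an owner, so their clusters are merged with union-find. The addresses and transactions below are hypothetical.

```python
def cluster_addresses(transactions):
    """Union-find over addresses: inputs of one transaction are assumed
    to belong to the same user, so their clusters are merged."""
    parent = {}
    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a
    def union(a, b):
        parent[find(a)] = find(b)
    for inputs in transactions:       # each tx = its list of input addresses
        for addr in inputs:
            find(addr)                # register every address
        for addr in inputs[1:]:
            union(inputs[0], addr)
    clusters = {}
    for a in parent:
        clusters.setdefault(find(a), set()).add(a)
    return sorted(sorted(c) for c in clusters.values())

txs = [["addr1", "addr2"], ["addr2", "addr3"], ["addr4"]]
```

Transaction mixing defeats exactly this heuristic by deliberately combining inputs from unrelated users.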
    Survey on Privacy-Preserving Machine Learning
    Liu Junxu, Meng Xiaofeng
    Journal of Computer Research and Development    2020, 57 (2): 346-362.   DOI: 10.7544/issn1000-1239.2020.20190455
    Abstract views: 5619 | HTML views: 294 | PDF (1684KB) downloads: 5882
    Large-scale data collection has vastly improved the performance of machine learning and achieved a win-win situation for both economic and social benefits, while personal privacy preservation is facing new and greater risks and crises. In this paper, we summarize the privacy issues in machine learning and the existing work on privacy-preserving machine learning. We discuss two settings of the model training process: centralized learning and federated learning. The former needs to collect all the user data before training; although this setting is easy to deploy, it still carries enormous hidden privacy and security risks. The latter allows massive numbers of devices to collaboratively train a global model while keeping their data local; as it is still at an early stage of study, it also has many problems to be solved. The existing work on privacy-preserving techniques falls into two main lines: encryption methods, including homomorphic encryption and secure multi-party computation, and perturbation methods represented by differential privacy, each having its advantages and disadvantages. In this paper, we first focus on the design of differentially-private machine learning algorithms, especially under the centralized setting, and discuss the differences between traditional machine learning models and deep learning models. Then, we summarize the problems existing in current federated learning research. Finally, we propose the main challenges for future work and point out the connection among privacy protection, model interpretation and data transparency.
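The perturbation line of work can be sketched with the Laplace mechanism, the textbook differential privacy primitive: a counting query has sensitivity 1, so adding Laplace(1/epsilon) noise yields epsilon-differential privacy. This is a minimal stdlib sketch under those assumptions, not any specific system from the survey:

```python
import math
import random

def laplace_noise(scale, rng):
    """Inverse-CDF sampling from the Laplace distribution."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(true_count, epsilon, rng):
    """Laplace mechanism for a counting query (sensitivity 1):
    release true_count + Laplace(1/epsilon) noise."""
    return true_count + laplace_noise(1.0 / epsilon, rng)

noisy = private_count(100, epsilon=1.0, rng=random.Random(0))
```

Smaller epsilon means more noise and stronger privacy; differentially-private deep learning (e.g. DP-SGD) applies the same calibrated-noise idea to clipped gradients instead of query answers.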
    Recent Advances in Bayesian Machine Learning
    Zhu Jun, Hu Wenbo
    Journal of Computer Research and Development    2015, 52 (1): 16-26.   DOI: 10.7544/issn1000-1239.2015.20140107
    Abstract views: 4604 | HTML views: 129 | PDF (2137KB) downloads: 5099
    With the fast growth of big data, statistical machine learning has attracted tremendous attention from both industry and academia, with many successful applications in vision, speech, natural language, and biology. In particular, the last decades have seen the fast development of Bayesian machine learning, which now represents a very important class of techniques. In this article, we provide an overview of the recent advances in Bayesian machine learning, including the basics of Bayesian machine learning theory and methods, nonparametric Bayesian methods and inference algorithms, and regularized Bayesian inference. Finally, we also highlight the challenges and recent progress on large-scale Bayesian learning for big data, and discuss some future directions.
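The basics of Bayesian inference that the article reviews can be illustrated with the Beta-Bernoulli conjugate pair, the simplest case where the posterior has the same form as the prior and the update is a closed-form count:

```python
def beta_posterior(alpha, beta, observations):
    """Conjugate update: Beta(a, b) prior + Bernoulli 0/1 data
    -> Beta(a + heads, b + tails) posterior."""
    heads = sum(observations)
    return alpha + heads, beta + len(observations) - heads

def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Uniform Beta(1, 1) prior, then four coin flips.
a, b = beta_posterior(1, 1, [1, 1, 0, 1])
```

Nonparametric Bayesian methods and regularized Bayesian inference generalize exactly this prior-times-likelihood update to infinite-dimensional priors and posterior constraints, where inference is no longer closed-form.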
    Survey of Internet of Things Security
    Zhang Yuqing, Zhou Wei, Peng Anni
    Journal of Computer Research and Development    2017, 54 (10): 2130-2143.   DOI: 10.7544/issn1000-1239.2017.20170470
    Abstract views: 5162 | HTML views: 218 | PDF (1747KB) downloads: 4977
    With the development of the smart home, intelligent care and smart cars, the application fields of the IoT are becoming more and more widespread, and its security and privacy receive more attention from researchers. Currently, research on the security of the IoT is still in its initial stage, and most research results cannot solve the major security problems in the development of the IoT well. In this paper, we first introduce the three-layer logical architecture of the IoT, and outline the security problems and research priorities of each level. Then we discuss security issues such as privacy preserving and intrusion detection, which need special attention in the main IoT application scenarios (smart home, intelligent healthcare, connected vehicles, smart grid, and other industrial infrastructure). Through synthesizing and analyzing the deficiencies of existing research and the causes of security problems, we point out five major technical challenges in IoT security: privacy protection in data sharing, equipment security protection under limited resources, more effective intrusion detection and defense systems and methods, access control of equipment automation operations, and cross-domain authentication of mobile devices. We finally detail every technical challenge and point out the IoT security research hotspots of the future.
    Quantum Annealing Algorithms: State of the Art
    Du Weilin, Li Bin, and Tian Yu
    Journal of Computer Research and Development    2008, 45 (9): 1501-1508.  
    Abstract views: 2343 | HTML views: 50 | PDF (1382KB) downloads: 4907
    In mathematics and applications, quantum annealing is a new method for finding solutions to combinatorial optimization problems and ground states of glassy systems using quantum fluctuations. Quantum fluctuations can be simulated in computers using various quantum Monte Carlo techniques, such as the path integral Monte Carlo method, and thus they can be used to obtain a new kind of heuristic algorithm for global optimization. It can be said that the idea of quantum annealing comes from the celebrated classical simulated thermal annealing invented by Kirkpatrick. However, unlike a simulated annealing algorithm, which utilizes thermal fluctuations to help the algorithm jump from local optimum to global optimum, quantum annealing algorithms utilize quantum fluctuations to help the algorithm tunnel through the barriers directly from local optimum to global optimum. According to previous studies, although the quantum annealing algorithm is not capable, in general, of finding solutions to NP-complete problems in polynomial time, quantum annealing is still a promising optimization technique, which exhibits good performance on some typical optimization problems, such as the transverse Ising model and the traveling salesman problem. Provided in this paper is an overview of the principles and research progress of quantum annealing algorithms in recent years; several different kinds of quantum annealing algorithms are presented in detail; both the advantages and disadvantages of each algorithm are analyzed; and prospects for the research orientation of the quantum annealing algorithm in the future are given.
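The contrast drawn above can be made concrete with classical simulated annealing on a tiny ferromagnetic Ising chain, the thermal baseline; quantum annealing would replace the thermal acceptance rule below with quantum tunneling driven by a decreasing transverse field. The chain size, step count, and cooling schedule are illustrative choices.

```python
import math
import random

def ising_energy(spins):
    """Ferromagnetic chain, J = 1: E = -sum_i s_i * s_{i+1};
    the ground state is all spins aligned."""
    return -sum(a * b for a, b in zip(spins, spins[1:]))

def simulated_annealing(n=8, steps=4000, t0=2.0, seed=1):
    rng = random.Random(seed)
    spins = [rng.choice([-1, 1]) for _ in range(n)]
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-3   # linear cooling schedule
        i = rng.randrange(n)
        old = ising_energy(spins)
        spins[i] *= -1                          # propose a spin flip
        new = ising_energy(spins)
        # Metropolis rule: always accept downhill, uphill with exp(-dE/T).
        if new > old and rng.random() >= math.exp((old - new) / temp):
            spins[i] *= -1                      # reject the uphill move
    return spins

result = simulated_annealing()
```

As the temperature falls, uphill moves become rare and the chain settles into (or near) the aligned ground state of energy -(n-1).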
    Research Review of Knowledge Graph and Its Application in Medical Domain
    Hou Mengwei, Wei Rong, Lu Liang, Lan Xin, Cai Hongwei
    Journal of Computer Research and Development    2018, 55 (12): 2587-2599.   DOI: 10.7544/issn1000-1239.2018.20180623
    Abstract views: 6215 | HTML views: 250 | PDF (2825KB) downloads: 4628
    With the advent of the medical big data era, knowledge interconnection has received extensive attention. How to extract useful medical knowledge from massive data is the key to medical big data analysis. Knowledge graph technology provides a means to extract structured knowledge from massive texts and images. The combination of knowledge graph, big data technology and deep learning technology is becoming the core driving force for the development of artificial intelligence. Knowledge graph technology has broad application prospects in the medical domain, and its application will play an important role in resolving the contradiction between the supply of high-quality medical resources and the continuously increasing demand for medical services. At present, research on medical knowledge graphs is still in the exploratory stage, and existing knowledge graph technology generally suffers from problems such as low efficiency, many restrictions and poor extensibility in the medical domain. This paper first analyzes the medical knowledge graph architecture and construction technology, given the strong domain specificity and complex structure of medical big data. Secondly, the key technologies and research progress of knowledge extraction, knowledge representation, knowledge fusion and knowledge reasoning in medical knowledge graphs are summarized. In addition, the application status of medical knowledge graphs in clinical decision support, intelligent medical semantic retrieval, medical question answering systems and other medical services is introduced. Finally, the existing problems and challenges of current research are discussed and analyzed, and its development is prospected.
    Towards Measuring Unobservability in Anonymous Communication Systems
    Tan Qingfeng, Shi Jinqiao, Fang Binxing, Guo Li, Zhang Wentao, Wang Xuebin, Wei Bingjie
    Journal of Computer Research and Development    2015, 52 (10): 2373-2381.   DOI: 10.7544/issn1000-1239.2015.20150562
    Abstract views: 11252 | HTML views: 121 | PDF (6861KB) downloads: 4589
    Anonymous communication is one of the main privacy-preserving techniques and has been widely used to protect Internet users' privacy. However, existing anonymous communication systems are particularly vulnerable to traffic analysis, and researchers have been improving the unobservability of systems against Internet censorship and surveillance. How to quantify the degree of unobservability, however, is a key challenge in anonymous communication systems. We model anonymous communication systems as an alternating Turing machine and analyze the adversary's threat model. Based on this model, this paper proposes a relative entropy approach to quantifying the degree of unobservability of anonymous communication systems. The degree of unobservability is based on the probabilities of the flow patterns observed by attackers. We also apply this approach to measure the pluggable transports of Tor, and show how to calculate it to compare the level of unobservability of these systems. The experimental results show that it is useful for evaluating the level of unobservability of anonymous communication systems. Finally, we present the conclusion and discuss future work on measuring unobservability in anonymous communication systems.
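A minimal sketch of the relative entropy idea: compute the KL divergence between the flow-pattern distribution of an anonymized channel and that of normal traffic, where smaller divergence means the channel is harder to distinguish and hence more unobservable. The distributions below are hypothetical, not measurements of Tor's pluggable transports.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """D(P || Q) = sum_x p(x) * ln(p(x) / q(x)); patterns absent from Q
    are floored at eps to keep the sum finite."""
    return sum(px * math.log(px / max(q.get(x, 0.0), eps))
               for x, px in p.items() if px > 0)

# Hypothetical packet-size pattern distributions.
normal = {"small": 0.5, "large": 0.5}
tunneled = {"small": 0.45, "large": 0.55}     # close to normal traffic
obfuscated = {"small": 0.1, "large": 0.9}     # easily distinguishable
```

Under this measure the tunneled channel, whose patterns nearly match normal traffic, scores as far more unobservable than the skewed one.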
    Survey of Data-Centric Smart City
    Wang Jingyuan, Li Chao, Xiong Zhang, and Shan Zhiguang
    Journal of Computer Research and Development   
    Review of Entity Relation Extraction Methods
    Li Dongmei, Zhang Yang, Li Dongyuan, Lin Danqiong
    Journal of Computer Research and Development    2020, 57 (7): 1424-1448.   DOI: 10.7544/issn1000-1239.2020.20190358
    Abstract views: 4513 | HTML views: 198 | PDF (1404KB) downloads: 4495
    Information extraction has long been a focus of research in the field of natural language processing. It mainly includes three sub-tasks: entity extraction, relation extraction and event extraction, among which relation extraction is the core task and a significant part of information extraction. The main goal of entity relation extraction is to identify and determine the specific relation between entity pairs in large volumes of natural language text, which provides fundamental support for intelligent retrieval, semantic analysis, etc., improves search efficiency, and serves the automatic construction of knowledge bases. We briefly expound the development of entity relation extraction and introduce several tools and evaluation systems for relation extraction in both Chinese and English. In addition, four main families of entity relation extraction methods are covered in this paper: traditional relation extraction methods, and methods based respectively on traditional machine learning, deep learning and the open domain. More importantly, we summarize the mainstream research methods and corresponding representative results in different historical stages, and conduct a contrastive analysis of the different entity relation extraction methods. In the end, we forecast the contents and trends of future research.
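The traditional pattern-based family of methods can be sketched as follows; the two hand-written patterns are purely illustrative, and the machine learning and deep learning families replace such patterns with trained models:

```python
import re

# Hypothetical hand-written extraction patterns (illustrative only).
PATTERNS = [
    (re.compile(r"([\w ]+?) was born in ([\w ]+)"), "born_in"),
    (re.compile(r"([\w ]+?) works for ([\w ]+)"), "works_for"),
]

def extract_relations(sentence):
    """Return (head, relation, tail) triples matched by any pattern."""
    triples = []
    for pattern, relation in PATTERNS:
        for head, tail in pattern.findall(sentence):
            triples.append((head.strip(), relation, tail.strip()))
    return triples

triples = extract_relations("Marie Curie was born in Warsaw")
```

Such triples feed directly into the automatic knowledge base construction the abstract mentions.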
    Outliers and Change-Points Detection Algorithm for Time Series
    Su Weixing, Zhu Yunlong, Liu Fang, and Hu Kunyuan
    Journal of Computer Research and Development   
    Using Maximum Entropy Model for Chinese Text Categorization
    Li Ronglu, Wang Jianhui, Chen Xiaoyun, Tao Xiaopeng, and Hu Yunfa
    Journal of Computer Research and Development    2005, 42 (1): 94-101.  
    Abstract views: 4133 | HTML views: 87 | PDF (409KB) downloads: 4324
    With the rapid development of the World Wide Web, text classification has become a key technology for organizing and processing large amounts of document data. The maximum entropy model is a probability estimation technique widely used for a variety of natural language tasks. It offers a clean and accommodating framework to combine diverse pieces of contextual information to estimate the probability of certain linguistic phenomena. For many NLP tasks, this approach performs near the state-of-the-art level, or outperforms other competing probability methods when trained and tested under similar conditions. However, relatively little work has been done on applying the maximum entropy model to text categorization problems, and no previous work has focused on using it to classify Chinese documents. In this paper, the maximum entropy model is used for text categorization. Its categorization performance is compared and analyzed using different approaches for text feature generation, different numbers of features, and smoothing techniques. Moreover, in experiments it is compared with Bayes, KNN and SVM, and it is shown that its performance is higher than Bayes and comparable with KNN and SVM. It is a promising technique for text categorization.
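With binary word features, a maximum entropy classifier reduces to a conditional softmax over per-class sums of feature weights. The weights below are hand-set assumptions for illustration, rather than values fitted by iterative scaling as in training:

```python
import math

# Hypothetical per-class feature weights (illustrative, not trained).
WEIGHTS = {
    "sports": {"ball": 2.0, "team": 1.5},
    "finance": {"stock": 2.0, "market": 1.5},
}

def maxent_classify(words):
    """Conditional softmax: p(c | d) = exp(score_c) / Z."""
    scores = {c: sum(w.get(word, 0.0) for word in words)
              for c, w in WEIGHTS.items()}
    z = sum(math.exp(s) for s in scores.values())   # partition function
    probs = {c: math.exp(s) / z for c, s in scores.items()}
    return max(probs, key=probs.get), probs

label, probs = maxent_classify(["the", "team", "won", "the", "ball", "game"])
```

The model makes no independence assumption between features, which is exactly the "clean framework for combining contextual information" the abstract describes, and distinguishes it from naive Bayes.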
    Stock Network Community Detection Method Based on Influence Calculating Model
    Wang Hao, Li Guohuan, Yao Hongliang, Li Junzhao
    Journal of Computer Research and Development    2014, 51 (10): 2137-2147.   DOI: 10.7544/issn1000-1239.2014.20130575
    Abstract views: 1550 | HTML views: 55 | PDF (3150KB) downloads: 4302
    Taking advantage of the energy characteristics of complex systems, the concept of influence is introduced into research on community detection methods, so that community structure can be discovered effectively. With regard to stock closing prices, by introducing the definitions of influence and node centrality, a stock network is constructed with influence as the edge weight. This paper proposes an algorithm for stock network hierarchical clustering based on the influence calculating model, referred to as the BCNHC algorithm. Firstly, the BCNHC algorithm introduces the definitions of node activity and influence, and puts forward an influence calculating model for nodes in networks. Then, on the basis of the measure criterion of node centrality, nodes with large centrality values are selected as center nodes, and the nodes' intimacy and influence model are utilized to ensure the influence of association between neighbor nodes. Furthermore, nodes with minimum degree are gathered toward the center nodes, so as to reduce the erroneous clustering caused by uncertainty about which community neighbor nodes belong to. On this basis, neighbor communities are clustered by the average influence of association between communities. This guarantees that the influence of association reaches its maximum for all the nodes in a community, until the modularity of the entire network comes to its maximum. At last, experimental comparison and analysis on stock networks prove the feasibility of the BCNHC algorithm.
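The modularity that the clustering drives to a maximum can be computed directly from Newman's formula; the toy unweighted graph below (two triangles joined by one edge, a standard example rather than a stock network) shows that the natural split scores higher than an arbitrary one:

```python
def modularity(edges, communities):
    """Newman modularity Q = sum_c [ e_c/m - (d_c/(2m))^2 ], where e_c is
    the number of intra-community edges and d_c the total degree of c."""
    m = float(len(edges))
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    label = {n: i for i, nodes in enumerate(communities) for n in nodes}
    q = sum(1.0 for u, v in edges if label[u] == label[v]) / m
    for nodes in communities:
        d_c = sum(degree[n] for n in nodes)
        q -= (d_c / (2.0 * m)) ** 2
    return q

# Two triangles joined by one bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
good = [{"a", "b", "c"}, {"d", "e", "f"}]   # the natural split
bad = [{"a", "b", "d"}, {"c", "e", "f"}]    # an arbitrary split
```

A hierarchical algorithm of the kind described above would keep merging neighbor communities as long as such a quality score keeps increasing.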
    Survey and Prospect of Intelligent Interaction-Oriented Image Recognition Techniques
    Jiang Shuqiang, Min Weiqing, Wang Shuhui
    Journal of Computer Research and Development    2016, 53 (1): 113-122.   DOI: 10.7544/issn1000-1239.2016.20150689
    Abstract views: 2648 | HTML views: 84 | PDF (969KB) downloads: 4273
    Vision plays an important role in both human interaction and human-nature interaction. Furthermore, equipping terminals with intelligent visual recognition and interaction is one of the core challenges in artificial intelligence and computer technology, and also one of their lofty goals. With the rapid development of visual recognition techniques, new techniques and problems have emerged in recent years. Correspondingly, applications with intelligent interaction also present a few new characteristics, which are changing our original understanding of visual recognition and interaction. We give a survey of image recognition techniques, covering recent advances in visual recognition, visual description, and visual question answering (VQA). Specifically, we first focus on deep learning approaches for image recognition and scene classification. Next, the latest techniques in visual description and VQA are analyzed and discussed. Then we introduce visual recognition and interaction applications in mobile devices and robots. Finally, we discuss future research directions in this field.
    Edge Computing—An Emerging Computing Model for the Internet of Everything Era
    Shi Weisong, Sun Hui, Cao Jie, Zhang Quan, Liu Wei
    Journal of Computer Research and Development    2017, 54 (5): 907-924.   DOI: 10.7544/issn1000-1239.2017.20160941
    Abstract views: 4949 | HTML views: 228 | PDF (4113KB) downloads: 4166
    With the proliferation of Internet of things (IoT) and the burgeoning of 4G/5G network, we have seen the dawning of the IoE (Internet of everything) era, where there will be a huge volume of data generated by things that are immersed in our daily life, and hundreds of applications will be deployed at the edge to consume these data. Cloud computing as the de facto centralized big data processing platform is not efficient enough to support these applications emerging in the IoE era: 1) the computing capacity available in the centralized cloud cannot keep up with the explosively growing computational needs of massive data generated at the edge of the network; 2) longer user-perceived latency is caused by the data movement between the edge and the cloud; 3) privacy and security concerns are raised by data owners in the edge; 4) edge devices face energy constraints. These issues in the centralized big data processing era have pushed the horizon of a new computing paradigm, edge computing, which calls for processing the data at the edge of the network. Leveraging the power of cloud computing, edge computing has the potential to address the limitation of computing capability, the concerns of response time requirements, bandwidth cost saving, data safety and privacy, as well as battery life constraints. “Edge” in edge computing is defined as any computing and network resources along the path between data sources and cloud data centers. In this paper, we introduce the definition of edge computing, followed by several case studies, ranging from cloud offloading to smart home and city, as well as collaborative edge, to materialize the concept of edge computing. Finally, we present several challenges and opportunities in the field of edge computing, and hope this paper will gain attention from the community and inspire more research in this direction.