Top Downloaded
1
2013, 50(1): 146-169.
Abstract:
The types and volume of data in human society are growing at an amazing speed, driven by emerging services such as cloud computing, the Internet of Things, and social networks; the era of big data has arrived. Data has turned from a simple object to be processed into a fundamental resource, and how to manage and utilize big data better has attracted much attention. Whether big data calls for evolution or revolution in database research is an open question. This paper discusses the concept of big data and surveys its state of the art. The framework of big data is described and its key techniques are studied. Finally, some new challenges for the future are summarized.
2
2016, 53(3): 582-600.
DOI: 10.7544/issn1000-1239.2016.20148228
Abstract:
Google's knowledge graph technology has drawn a lot of research attention in recent years. However, due to the limited public disclosure of technical details, people find it difficult to understand the connotation and value of this technology. In this paper, we introduce the key techniques involved in the construction of knowledge graphs in a bottom-up way, starting from a clearly defined concept and a technical architecture of the knowledge graph. Firstly, we describe in detail the definition and connotation of the knowledge graph, and then we propose a technical framework for knowledge graph construction, in which the construction process is divided into three levels according to the abstraction level of the input knowledge materials: the information extraction layer, the knowledge integration layer, and the knowledge processing layer. Secondly, the research status of the key technologies at each level is surveyed comprehensively and examined critically, in order to gradually reveal the mysteries of knowledge graph technology, its state-of-the-art progress, and its relationship with related disciplines. Finally, five major research challenges in this area are summarized, and the corresponding key research issues are highlighted.
3
2016, 53(2): 247-261.
DOI: 10.7544/issn1000-1239.2016.20160020
Abstract:
Knowledge bases are usually represented as networks with entities as nodes and relations as edges. With a network representation of knowledge bases, specific algorithms have to be designed to store and utilize them, and these algorithms are usually time consuming and suffer from the data sparsity issue. Recently, representation learning, exemplified by deep learning, has attracted much attention in natural language processing, computer vision, and speech analysis. Representation learning aims to project the objects of interest into a dense, real-valued, and low-dimensional semantic space, whereas knowledge representation learning focuses on the representation learning of entities and relations in knowledge bases. Representation learning can efficiently measure the semantic correlations of entities and relations, alleviate sparsity issues, and significantly improve the performance of knowledge acquisition, fusion, and inference. In this paper, we introduce the recent advances in knowledge representation learning, summarize the key challenges and possible solutions, and give an outlook on future research and application directions.
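As a minimal illustration of the embedding idea described above, the sketch below scores a triple with a translation-based model in the spirit of TransE, one representative knowledge representation learning method; the entities, relation, embedding dimension, and random vectors are illustrative assumptions, not material from the paper.

```python
import numpy as np

# Sketch of a translation-based knowledge embedding score (TransE-style):
# a triple (head, relation, tail) is plausible when head + relation lies close
# to tail in a low-dimensional real-valued space. Dimension and vectors are
# illustrative; real systems learn them from the knowledge base.

rng = np.random.default_rng(0)
dim = 50
entity_emb = {e: rng.normal(size=dim) for e in ["Beijing", "China", "Paris", "France"]}
relation_emb = {"capital_of": rng.normal(size=dim)}

def score(head: str, relation: str, tail: str) -> float:
    """Lower score = more plausible triple under the translation assumption."""
    return float(np.linalg.norm(entity_emb[head] + relation_emb[relation] - entity_emb[tail]))

# Semantic correlation can be read directly from vector distance, which is what
# lets embeddings alleviate the sparsity of purely symbolic network representations.
print(score("Beijing", "capital_of", "China"))
print(score("Beijing", "capital_of", "France"))
```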
4
2013, 50(9): 1799-1804.
Abstract:
Machine learning is an important area of artificial intelligence. Since the 1980s, it has achieved huge success in terms of algorithms, theory, and applications. Since 2006, a new machine learning paradigm named deep learning has become popular in the research community and has grown into a major technology trend for big data and artificial intelligence. Deep learning simulates the hierarchical structure of the human brain, processing data from lower to higher levels and gradually composing more and more semantic concepts. In recent years, Google, Microsoft, IBM, and Baidu have invested a lot of resources in the R&D of deep learning, making significant progress on speech recognition, image understanding, natural language processing, and online advertising. In terms of contribution to real-world applications, deep learning is perhaps the most successful progress made by the machine learning community in the last ten years. In this article, we give a high-level overview of the past and current state of deep learning, discuss the main challenges, and share our views on its future development.
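To make the layer-by-layer composition idea concrete, here is a minimal NumPy sketch of a feedforward pass through a deep network; the layer widths, ReLU activation, and random weights are illustrative assumptions rather than anything from the article.

```python
import numpy as np

# Sketch of hierarchical feature composition in a deep network: each layer
# transforms the previous layer's representation, so features become
# progressively more abstract from input to output. All sizes and weights
# below are illustrative assumptions.

rng = np.random.default_rng(0)
layer_sizes = [784, 256, 64, 10]            # e.g. raw pixels -> edges -> parts -> classes
weights = [rng.normal(scale=0.01, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x: np.ndarray) -> np.ndarray:
    h = x
    for w in weights[:-1]:
        h = np.maximum(h @ w, 0.0)          # ReLU: compose lower-level features
    return h @ weights[-1]                  # final layer produces class scores

print(forward(rng.normal(size=(1, 784))).shape)   # (1, 10)
```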
5
2019, 56(1): 69-89.
DOI: 10.7544/issn1000-1239.2019.20180760
Abstract:
With the burgeoning of the Internet of Everything, the amount of data generated by edge devices is increasing dramatically, resulting in higher network bandwidth requirements. Meanwhile, the emergence of novel applications calls for lower network latency. It is an unprecedented challenge for cloud computing to guarantee quality of service while dealing with such a massive amount of data, which has pushed the horizon of edge computing. Edge computing processes data at the edge of the network and has developed rapidly since 2014, as it has the potential to reduce latency and bandwidth charges, address the limited computing capability of cloud data centers, increase availability, and protect data privacy and security. This paper mainly discusses three questions about edge computing: where does it come from, what is its current status, and where is it going? The paper first sorts out the development of edge computing and divides it into three periods: technology preparation, rapid growth, and steady development. It then summarizes seven essential technologies that drive the rapid development of edge computing. After that, six typical applications that have been widely used in edge computing are illustrated. Finally, the paper proposes six open problems that need to be solved urgently in future development.
6
2017, 54(10): 2170-2186.
DOI: 10.7544/issn1000-1239.2017.20170471
Abstract:
The core features of blockchain technology are decentralization and trustlessness. As a distributed ledger technology, smart contract infrastructure platform, and novel distributed computing paradigm, it can effectively enable programmable currency, programmable finance, and a programmable society, which will have a far-reaching impact on finance and other fields and drive a new round of technological and application change. While blockchain technology can improve efficiency, reduce costs, and enhance data security, it still faces serious privacy issues that have drawn wide attention from researchers. This survey first analyzes the technical characteristics of the blockchain, defines the concepts of identity privacy and transaction privacy, points out the advantages and disadvantages of blockchain technology in privacy protection, and introduces the attack methods in existing research, such as transaction tracing and account clustering. We then introduce a variety of privacy-preserving mechanisms, including malicious node detection and access restriction for the network layer; transaction mixing, encryption, and limited release for the transaction layer; and defense mechanisms for the blockchain application layer. In the end, we discuss the limitations of existing techniques and envision future directions on this topic. In addition, regulatory approaches to the malicious use of blockchain technology are discussed.
7
2016, 53(1): 165-192.
DOI: 10.7544/issn1000-1239.2016.20150661
Abstract:
Entity alignment on knowledge bases has been a hot research topic in recent years. The goal is to link multiple knowledge bases effectively and create a large-scale, unified knowledge base from the top level, which can help machines understand the data and support more intelligent applications. However, there are still many research challenges regarding data quality and scalability, especially against the background of big data. In this paper, we present a survey of the techniques and algorithms for entity alignment on knowledge bases over the past decade, and we expect to provide alternative options for further research by classifying and summarizing the existing methods. Firstly, the entity alignment problem is formally defined. Secondly, the overall architecture is summarized and the research progress is reviewed in detail from the aspects of algorithms, feature matching, and indexing. The entity alignment algorithms are the key to solving this problem and can be divided into pair-wise methods and collective methods. The most commonly used collective entity alignment algorithms are discussed in detail from both local and global perspectives. Some important experimental and real-world data sets are introduced as well. Finally, open research issues are discussed and possible future research directions are outlined.
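As a minimal sketch of the pair-wise family mentioned above, the toy example below scores candidate entity pairs from two knowledge bases by name similarity and keeps pairs above a threshold; the entity names, similarity measure, and threshold are illustrative assumptions, and real systems add attribute and relation features, indexing to avoid the full cross product, and collective inference.

```python
from difflib import SequenceMatcher

# Sketch of pair-wise entity alignment: score every candidate pair across two
# knowledge bases by surface-name similarity and accept pairs above a threshold.
# Names, threshold, and the similarity function are illustrative assumptions.

kb_a = {"a1": "Peking University", "a2": "Tsinghua University"}
kb_b = {"b1": "Peking Univ.", "b2": "Tsinghua Univ."}

def name_sim(x: str, y: str) -> float:
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()

THRESHOLD = 0.6  # assumed; tuned on labeled pairs in practice
alignments = [(ea, eb, round(name_sim(na, nb), 2))
              for ea, na in kb_a.items()
              for eb, nb in kb_b.items()
              if name_sim(na, nb) >= THRESHOLD]
print(alignments)
```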
8
2020, 57(2): 346-362.
DOI: 10.7544/issn1000-1239.2020.20190455
Abstract:
Large-scale data collection has vastly improved the performance of machine learning and achieved a win-win situation in terms of both economic and social benefits, while personal privacy preservation is facing new and greater risks and crises. In this paper, we summarize the privacy issues in machine learning and the existing work on privacy-preserving machine learning. We discuss two settings of the model training process: centralized learning and federated learning. The former needs to collect all the user data before training; although this setting is easy to deploy, it still entails enormous hidden privacy and security risks. The latter allows massive numbers of devices to collaboratively train a global model while keeping their data local; as it is currently at an early stage of study, it also has many problems to be solved. The existing work on privacy-preserving techniques falls into two main lines: encryption methods, including homomorphic encryption and secure multi-party computation, and perturbation methods represented by differential privacy, each having its advantages and disadvantages. In this paper, we first focus on the design of differentially-private machine learning algorithms, especially in the centralized setting, and discuss the differences between traditional machine learning models and deep learning models. Then, we summarize the problems existing in current federated learning research. Finally, we propose the main challenges for future work and point out the connections among privacy protection, model interpretation, and data transparency.
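As a minimal sketch of the perturbation line mentioned above, the example below releases a differentially-private mean with the Laplace mechanism, the basic primitive behind many differentially-private statistics; the data, clipping bounds, and privacy budget epsilon are illustrative assumptions and do not come from the paper.

```python
import numpy as np

# Sketch of the Laplace mechanism: add noise scaled to the query's sensitivity
# divided by the privacy budget epsilon. Data, bounds, and epsilon are
# illustrative assumptions.

rng = np.random.default_rng(0)

def private_mean(values: np.ndarray, lower: float, upper: float, epsilon: float) -> float:
    clipped = np.clip(values, lower, upper)          # bound each record's contribution
    sensitivity = (upper - lower) / len(clipped)     # max change from altering one record
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return float(clipped.mean() + noise)

ages = rng.integers(18, 90, size=1000).astype(float)
print(private_mean(ages, lower=18, upper=90, epsilon=0.5))
```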
9
2015, 52(1): 16-26.
DOI: 10.7544/issn1000-1239.2015.20140107
Abstract:
With the fast growth of big data, statistical machine learning has attracted tremendous attention from both industry and academia, with many successful applications in vision, speech, natural language, and biology. In particular, the last decades have seen the fast development of Bayesian machine learning, which now represents a very important class of techniques. In this article, we provide an overview of the recent advances in Bayesian machine learning, including the basics of Bayesian machine learning theory and methods, nonparametric Bayesian methods and inference algorithms, and regularized Bayesian inference. Finally, we highlight the challenges and recent progress in large-scale Bayesian learning for big data, and discuss some future directions.
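To make the Bayesian basics referred to above concrete, here is a minimal sketch of posterior updating in a conjugate Beta-Bernoulli model, where the posterior is available in closed form; the prior hyperparameters and observed data are illustrative assumptions.

```python
# Sketch of Bayesian updating with a conjugate Beta-Bernoulli model:
# prior Beta(alpha, beta) plus 0/1 observations gives a closed-form posterior.
# Prior values and data are illustrative assumptions.

def beta_bernoulli_posterior(alpha: float, beta: float, data: list[int]) -> tuple[float, float]:
    """Posterior Beta(alpha + #successes, beta + #failures) after observing 0/1 data."""
    successes = sum(data)
    failures = len(data) - successes
    return alpha + successes, beta + failures

alpha_post, beta_post = beta_bernoulli_posterior(alpha=1.0, beta=1.0, data=[1, 0, 1, 1, 0, 1])
posterior_mean = alpha_post / (alpha_post + beta_post)
print(alpha_post, beta_post, round(posterior_mean, 3))   # 5.0 3.0 0.625
```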
10
2017, 54(10): 2130-2143.
DOI: 10.7544/issn1000-1239.2017.20170470
Abstract:
With the development of smart homes, intelligent care, and smart cars, the application fields of the IoT are becoming more and more widespread, and its security and privacy are receiving increasing attention from researchers. Currently, research on the security of the IoT is still in its initial stage, and most existing results cannot solve the major security problems in the development of the IoT well. In this paper, we first introduce the three-layer logical architecture of the IoT and outline the security problems and research priorities of each level. Then we discuss the security issues, such as privacy preservation and intrusion detection, that need special attention in the main IoT application scenarios (smart home, intelligent healthcare, connected vehicles, smart grid, and other industrial infrastructure). Through synthesizing and analyzing the deficiencies of existing research and the causes of the security problems, we point out five major technical challenges in IoT security: privacy protection in data sharing, device security protection under limited resources, more effective intrusion detection and defense systems and methods, access control for automated device operations, and cross-domain authentication of mobile devices. We finally detail each technical challenge and point out future research hotspots in IoT security.
11
2008, 45(9): 1501-1508.
Abstract:
In mathematics and applications, quantum annealing is a new method for finding solutions to combinatorial optimization problems and ground states of glassy systems using quantum fluctuations. Quantum fluctuations can be simulated in computers using various quantum Monte Carlo techniques, such as the path integral Monte Carlo method, and thus they can be used to obtain a new kind of heuristic algorithm for global optimization. It can be said that the idea of quantum annealing comes from the celebrated classical simulated thermal annealing invented by Kirkpatrick. However, unlike a simulated annealing algorithm, which utilizes thermal fluctuations to help the search jump from a local optimum to the global optimum, quantum annealing algorithms utilize quantum fluctuations to help the search tunnel through barriers directly from a local optimum to the global optimum. According to previous studies, although the quantum annealing algorithm is not capable, in general, of finding solutions to NP-complete problems in polynomial time, quantum annealing is still a promising optimization technique which exhibits good performance on some typical optimization problems, such as the transverse-field Ising model and the traveling salesman problem. This paper provides an overview of the principles and recent research progress of quantum annealing algorithms; several kinds of quantum annealing algorithms are presented in detail; the advantages and disadvantages of each algorithm are analyzed; and prospects for future research directions of quantum annealing are given.
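For contrast with the quantum case, here is a minimal sketch of the classical simulated annealing baseline described above, where uphill moves are accepted with thermal probability exp(-delta/T) as the temperature is cooled; the objective function, neighbourhood, and cooling schedule are illustrative assumptions. A quantum annealer would instead exploit tunneling through barriers, typically simulated with path integral Monte Carlo, which is beyond this sketch.

```python
import math
import random

# Sketch of classical simulated annealing: accept downhill moves always and
# uphill moves with probability exp(-delta / T), then cool T geometrically.
# Objective, step size, and schedule are illustrative assumptions.

random.seed(0)

def objective(x: float) -> float:
    return x**4 - 3 * x**2 + x            # a simple multi-well function

x, temperature = 2.0, 5.0
while temperature > 1e-3:
    candidate = x + random.uniform(-0.5, 0.5)
    delta = objective(candidate) - objective(x)
    if delta < 0 or random.random() < math.exp(-delta / temperature):
        x = candidate                      # thermal fluctuation lets the search escape local minima
    temperature *= 0.99                    # geometric cooling schedule

print(round(x, 3), round(objective(x), 3))
```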
12
2018, 55(12): 2587-2599.
DOI: 10.7544/issn1000-1239.2018.20180623
Abstract:
With the advent of the medical big data era, knowledge interconnection has received extensive attention. How to extract useful medical knowledge from massive data is the key to medical big data analysis. Knowledge graph technology provides a means to extract structured knowledge from massive texts and images. The combination of knowledge graphs, big data technology, and deep learning is becoming a core driving force for the development of artificial intelligence. Knowledge graph technology has broad application prospects in the medical domain, where it will play an important role in resolving the contradiction between the supply of high-quality medical resources and the continuously increasing demand for medical services. At present, research on medical knowledge graphs is still at an exploratory stage, and existing knowledge graph techniques generally suffer from low efficiency, many restrictions, and poor extensibility in the medical domain. This paper first analyzes the architecture and construction technology of medical knowledge graphs, given the strong domain expertise and complex structure of medical big data. Secondly, the key technologies and research progress of knowledge extraction, knowledge representation, knowledge fusion, and knowledge reasoning in medical knowledge graphs are summarized. In addition, the application status of medical knowledge graphs in clinical decision support, intelligent medical semantic retrieval, medical question answering systems, and other medical services is introduced. Finally, the problems and challenges of current research are discussed and analyzed, and its development is prospected.
13
2015, 52(10): 2373-2381.
DOI: 10.7544/issn1000-1239.2015.20150562
Abstract:
Anonymous communication is one of the main privacy-preserving techniques and has been widely used to protect Internet users' privacy. However, existing anonymous communication systems are particularly vulnerable to traffic analysis, and researchers have been improving the unobservability of such systems against Internet censorship and surveillance. How to quantify the degree of unobservability remains a key challenge for anonymous communication systems. We model anonymous communication systems as an alternating Turing machine and analyze the adversaries' threat model. Based on this model, this paper proposes a relative entropy approach to quantify the degree of unobservability of anonymous communication systems, based on the probabilities of the flow patterns observed by attackers. We also apply this approach to measure the pluggable transports of Tor and show how to calculate it to compare the level of unobservability of these systems. The experimental results show that the approach is useful for evaluating the level of unobservability of anonymous communication systems. Finally, we present our conclusions and discuss future work on measuring unobservability in anonymous communication systems.
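As a minimal sketch of the relative entropy idea described above, the example below computes the KL divergence between the flow-pattern distribution of an anonymity system and that of ordinary cover traffic: the smaller the divergence, the harder the system is to distinguish and the more unobservable it is. The two example distributions over packet-size buckets are illustrative assumptions, not measurements from the paper.

```python
import math

# Sketch of unobservability measured as relative entropy (KL divergence)
# between observed flow-pattern distributions. Distributions are illustrative.

def kl_divergence(p: list[float], q: list[float]) -> float:
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

ordinary_traffic = [0.40, 0.35, 0.15, 0.10]      # baseline flow-pattern distribution
shaped_transport = [0.38, 0.36, 0.16, 0.10]      # traffic shaped to mimic the baseline
unshaped_tool = [0.10, 0.20, 0.30, 0.40]         # unshaped traffic, easier to single out

print(round(kl_divergence(shaped_transport, ordinary_traffic), 4))   # small: hard to distinguish
print(round(kl_divergence(unshaped_tool, ordinary_traffic), 4))      # large: easy to distinguish
```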
14
Abstract:
Motivated by the sustainable development requirements of the global environment and modern cities, the concept of the Smart City has been introduced as a strategic instrument for future urbanization on a global scale. Modern cities, in turn, have built up well-developed information infrastructure and gathered massive amounts of city operational data, and are therefore ready for the coming of Smart City concepts, technologies, and applications. An important peculiarity of the Smart City is that its technology system is data-centric: data science and technologies, such as big data, data vitalization, and data mining, play pivotal roles in Smart City related technologies. In this paper, we provide a comprehensive survey of the most recent research activities in data-centric Smart City. The survey takes an informatics perspective, and all of the summarized Smart City works are based on data science and technologies. The paper first summarizes the variety and analyzes the features of the urban data used in existing Smart City research and applications. Then, state-of-the-art progress in data-centric Smart City research is surveyed from two aspects: research activities and research specialties. The research activities are introduced in terms of system architectures, smart transportation, urban computing, and human mobility. The research specialties are introduced in terms of core technologies and theory, interdisciplinarity, data-centric characteristics, and regional features. Finally, the paper raises some directions for future work.
15
2020, 57(7): 1424-1448.
DOI: 10.7544/issn1000-1239.2020.20190358
Abstract:
Information extraction has long attracted attention in the field of natural language processing. It mainly includes three sub-tasks: entity extraction, relation extraction, and event extraction, among which relation extraction is the core mission and a significant part of information extraction. The main goal of entity relation extraction is to identify and determine the specific relation between entity pairs in large amounts of natural language text, which provides fundamental support for intelligent retrieval, semantic analysis, and so on, and improves both search efficiency and the automatic construction of knowledge bases. We briefly review the development of entity relation extraction and introduce several tools and evaluation systems for relation extraction in both Chinese and English. In addition, four main families of entity relation extraction methods are covered in this paper: traditional relation extraction methods and methods based on traditional machine learning, deep learning, and the open domain. More importantly, we summarize the mainstream research methods and corresponding representative results of different historical stages, and conduct a comparative analysis of the different entity relation extraction methods. In the end, we forecast the content and trends of future research.
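As a minimal sketch of the traditional, pattern-based family named above, the example below maps surface text to (head entity, relation, tail entity) triples with hand-written lexical patterns; the patterns and sentences are illustrative assumptions, and the learned families (feature-based, deep learning, open-domain) exist precisely because such templates are brittle.

```python
import re

# Sketch of pattern-based entity relation extraction: hand-written templates
# turn matching sentences into relation triples. Patterns are illustrative.

PATTERNS = [
    (re.compile(r"(?P<head>[A-Z][\w ]+?) is the capital of (?P<tail>[A-Z][\w ]+)"), "capital_of"),
    (re.compile(r"(?P<head>[A-Z][\w ]+?) was founded by (?P<tail>[A-Z][\w ]+)"), "founded_by"),
]

def extract(sentence: str) -> list[tuple[str, str, str]]:
    triples = []
    for pattern, relation in PATTERNS:
        for m in pattern.finditer(sentence):
            triples.append((m.group("head").strip(), relation, m.group("tail").strip()))
    return triples

print(extract("Beijing is the capital of China."))
print(extract("Microsoft was founded by Bill Gates."))
```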
16
2022, 59(1): 47-80.
DOI: 10.7544/issn1000-1239.20201055
Abstract:
In recent years, the application of deep learning to graph-structured data has attracted more and more attention, and the emergence of graph neural networks has led to major breakthroughs in fields such as social networking, natural language processing, computer vision, and even the life sciences. A graph neural network can treat a practical problem as one of connections between nodes in a graph and of message propagation, and it can model the dependencies between nodes, so that graph-structured data can be handled well. In view of this, graph neural network models and their applications are systematically reviewed. Firstly, graph convolutional neural networks are explained from three aspects: the spectral domain, the spatial domain, and pooling. Then, graph neural network models based on the attention mechanism and on autoencoders are described, and graph neural networks implemented by other methods are also covered. Secondly, the discussion and analysis of whether graph neural networks can be made bigger and deeper are summarized. Furthermore, four frameworks for graph neural networks are summarized, and the applications of graph neural networks in natural language processing, computer vision, and other areas are explained in detail. Finally, future research on graph neural networks is prospected and summarized. Compared with existing review articles on graph neural networks, this paper elaborates the relevant spectral theory in detail and comprehensively summarizes the development history of spectral-domain graph convolutional neural networks. At the same time, a new classification standard and improved models addressing the low efficiency of spatial-domain graph convolutional neural networks are given. For the first time, the discussion and analysis of the expressive power and theoretical guarantees of graph neural networks are summarized, and a new framework model is added. In the application part, the latest applications of graph neural networks are explained.
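To make the message-propagation view above concrete, here is a minimal sketch of one graph convolutional layer in a commonly used spectrally motivated form (the Kipf and Welling style rule H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)); the toy graph, feature sizes, and random weights are illustrative assumptions and not a formulation taken from the survey.

```python
import numpy as np

# Sketch of one graph convolutional layer: each node aggregates its neighbours'
# features through the normalized adjacency before a shared linear transform.
# Graph, feature dimensions, and weights are illustrative assumptions.

rng = np.random.default_rng(0)

A = np.array([[0, 1, 0, 0],      # toy 4-node graph (undirected edges 0-1, 1-2, 2-3)
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = rng.normal(size=(4, 8))      # node features
W = rng.normal(size=(8, 4))      # layer weights

A_hat = A + np.eye(4)                                        # add self-loops
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # D^-1/2 (A + I) D^-1/2

H_next = np.maximum(A_norm @ H @ W, 0.0)                     # propagate, transform, ReLU
print(H_next.shape)                                          # (4, 4)
```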
17
2005, 42(1): 94-101.
Abstract:
With the rapid development of the World Wide Web, text classification has become a key technology for organizing and processing large amounts of document data. The maximum entropy model is a probability estimation technique widely used for a variety of natural language tasks. It offers a clean and accommodating framework for combining diverse pieces of contextual information to estimate the probability of a certain linguistic phenomenon. For many NLP tasks this approach performs at near state-of-the-art levels, or outperforms competing probabilistic methods when trained and tested under similar conditions. However, relatively little work has been done on applying maximum entropy models to text categorization problems, and no previous work has focused on using maximum entropy models to classify Chinese documents. In this paper, the maximum entropy model is used for text categorization. Its categorization performance is compared and analyzed using different approaches to text feature generation, different numbers of features, and smoothing techniques. Moreover, in experiments it is compared with Bayes, KNN, and SVM classifiers, and it is shown that its performance is higher than Bayes and comparable with KNN and SVM. It is a promising technique for text categorization.
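As a minimal sketch of the conditional maximum entropy model referred to above, the example below computes P(c | d) = exp(sum_i lambda_i f_i(d, c)) / Z(d) with binary word-class features; the vocabulary, classes, and weights are illustrative assumptions, and in practice the weights are fitted by iterative scaling or gradient methods.

```python
import numpy as np

# Sketch of a conditional maximum entropy text classifier: each binary feature
# fires when a word co-occurs with a class; class probabilities are the
# normalized exponentials of the summed feature weights. Values are illustrative.

classes = ["sports", "finance"]
vocab = ["match", "goal", "stock", "market"]
# weights[c][w] = lambda for feature "word w appears AND class is c"
weights = np.array([[1.2, 0.9, -0.4, -0.6],    # sports
                    [-0.5, -0.7, 1.1, 1.0]])   # finance

def predict_proba(words: list[str]) -> dict[str, float]:
    x = np.array([1.0 if w in words else 0.0 for w in vocab])
    scores = weights @ x                       # sum of active feature weights per class
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                       # normalization Z(d)
    return dict(zip(classes, probs.round(3)))

print(predict_proba(["goal", "match", "team"]))
print(predict_proba(["stock", "market", "index"]))
```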
18
2014, 51(4): 781-788.
Abstract:
Because conventional change-point detection methods suffer from time delay and are not applicable to time series mingled with outliers in practical applications, an outlier and change-point detection algorithm for time series based on the wavelet transform of the efficient score vector is proposed in this paper. The algorithm introduces the efficient score vector to solve the problem that, in conventional detection methods, the statistics often grow without bound as data are added during detection, and it proposes a strategy of analyzing the statistics with wavelets in order to avoid serious time delay. To distinguish outliers from change-points during detection, we propose a detection framework based on the relationship between the Lipschitz exponent and the wavelet coefficients, by which both outliers and change-points can be detected at the same time. The advantage of this method is that the detection result is not affected by outliers, which means the method can deal with time series containing both outliers and change-points under actual operating conditions and is more suitable for real applications. Finally, the effectiveness and practicality of the proposed detection method are demonstrated through simulation results.
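As a toy illustration of the scale behaviour that underlies the Lipschitz-exponent criterion mentioned above, the sketch below computes Haar-style detail coefficients at several scales: for an isolated outlier the coefficient magnitude shrinks as the scale grows, while for a level shift it persists. It operates on a raw synthetic series rather than the paper's efficient score statistic, and the series and scales are illustrative assumptions.

```python
import numpy as np

# Toy scale analysis: Haar-style differences around a point at several window
# widths. Outlier coefficients decay with scale; change-point coefficients persist.
# Series, positions, and scales are illustrative assumptions.

def haar_detail(x: np.ndarray, t: int, scale: int) -> float:
    """Difference of the means of the two half-windows of width `scale` around t."""
    left = x[t - scale:t].mean()
    right = x[t:t + scale].mean()
    return float(right - left)

rng = np.random.default_rng(0)
series = rng.normal(size=200)
series[80] += 6.0          # isolated outlier at t = 80
series[140:] += 3.0        # level shift (change-point) at t = 140

for t in (80, 140):
    coeffs = {s: round(haar_detail(series, t, s), 2) for s in (1, 4, 16)}
    print(t, coeffs)       # outlier: magnitude decays with scale; shift: stays near 3
```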
19
2014, 51(10): 2137-2147.
DOI: 10.7544/issn1000-1239.2014.20130575
Abstract:
Taking advantage of the energy characteristics of complex systems, the concept of influence is introduced into community detection so that community structure can be discovered effectively. Using stock closing prices, and by introducing the definitions of influence and node centrality, a stock network is constructed with influence as the edge weight. This paper proposes a stock network hierarchical clustering algorithm based on an influence calculation model, referred to as the BCNHC algorithm. Firstly, the BCNHC algorithm introduces the definitions of node activity and influence, and puts forward an influence calculation model for nodes in networks. Then, based on the measurement criterion of node centrality, the nodes with large centrality values are selected as center nodes, and the nodes' intimacy and the influence model are used to estimate the influence of association between neighboring nodes. Furthermore, nodes with minimum degree are gathered toward the center nodes, so as to reduce the clustering errors caused by uncertainty about which community a neighboring node belongs to. On this basis, neighboring communities are clustered according to their average influence of association, which ensures that the influence of association of all nodes within a community is maximized, until the modularity of the entire network reaches its maximum. Finally, comparative experiments and analysis on stock networks demonstrate the feasibility of the BCNHC algorithm.
20
2016, 53(1): 113-122.
DOI: 10.7544/issn1000-1239.2016.20150689
Abstract:
Vision plays an important role in both human-human interaction and human-nature interaction. Equipping terminals with intelligent visual recognition and interaction is one of the core challenges in artificial intelligence and computer technology, and also one of their lofty goals. With the rapid development of visual recognition techniques, new techniques and problems have emerged in recent years. Correspondingly, applications with intelligent interaction also present new characteristics, which are changing our original understanding of visual recognition and interaction. We give a survey of image recognition techniques, covering recent advances in visual recognition, visual description, and visual question answering (VQA). Specifically, we first focus on deep learning approaches for image recognition and scene classification. Next, the latest techniques in visual description and VQA are analyzed and discussed. Then we introduce visual recognition and interaction applications in mobile devices and robots. Finally, we discuss future research directions in this field.