Abstract:
Data characteristics and realistic demands have changed because of the large-scale data’s links and crossover. The data, which has main features of large scale, multi-source heterogeneous, cross domain, cross media, cross language, dynamic evolution and generalization, is playing an important role. And the corresponding data storage, analysis and understanding are also facing a major challenge. The immediate problem to be solved is how to use the data association, cross and integration to achieve the maximization of the value of big data. Our paper believes that the key to solve this problem lies in the integration of data, so we put forward the concept of large data fusion. We use Web data, scientific data and business data fusion as a case to analyze the demand and necessity of data fusion, and propose a new task of large data fusion, but also summarize and analyze the existing fusion technologies. Finally, we analyze the challenges that may be faced in the process of large data fusion and the problems caused by large data fusion.