Abstract:
The spread of the Internet and the growth of e-commerce have changed the way people access information and consume. For most people, the Web has become an important source of information. Meanwhile, information quality is becoming an increasingly prominent issue: much of the information available is outdated, incorrect, false, or biased. In particular, the problem of conflicting information provided by different websites is evident, and finding the truth among conflicting information is a problem that must be solved. To the best of our knowledge, no existing method considers the credibility of the individual data categories on a data source when discovering truth. We therefore formulate the problem of truth discovery based on the credibility of data categories on data sources. In this paper, we propose two methods to detect differences in credibility among the data categories on a source, and we use a Bayesian method to iteratively compute data source quality and data accuracy. Additionally, we consider data coverage and the difficulty of each object to improve the accuracy of truth finding. Experiments on a real data set show that our algorithms can significantly improve the accuracy of truth discovery.
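
To make the idea of per-category source credibility concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes a generic iterative voting scheme in which each (source, category) pair carries its own trust score, each claim votes with a log-odds weight, and trust is re-estimated as the average probability of the values a source claims in that category. The function name `discover_truth`, the parameters `n_false`, `init_trust`, and `iterations`, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict


def discover_truth(claims, n_false=10, init_trust=0.8, iterations=20):
    """Illustrative per-category truth discovery (assumed scheme, see lead-in).

    claims: list of (source, category, obj, value) tuples.
    Returns (truths, trust): the most probable value per object and the
    estimated trust of every (source, category) pair.
    """
    trust = defaultdict(lambda: init_trust)
    by_object = defaultdict(list)
    for source, category, obj, value in claims:
        by_object[obj].append((source, category, value))

    probs = {}
    for _ in range(iterations):
        # Step 1: score each candidate value by summing log-odds votes,
        # weighted by the trust of the claiming source in that category.
        for obj, votes in by_object.items():
            scores = defaultdict(float)
            for source, category, value in votes:
                t = trust[(source, category)]
                scores[value] += math.log(n_false * t / (1.0 - t))
            top = max(scores.values())
            weights = {v: math.exp(s - top) for v, s in scores.items()}
            total = sum(weights.values())
            probs[obj] = {v: w / total for v, w in weights.items()}

        # Step 2: re-estimate trust per (source, category) as the mean
        # probability of the values that source claimed in that category.
        sums = defaultdict(float)
        counts = defaultdict(int)
        for obj, votes in by_object.items():
            for source, category, value in votes:
                sums[(source, category)] += probs[obj][value]
                counts[(source, category)] += 1
        for key in sums:
            # Clamp so the log-odds weight stays finite.
            trust[key] = min(0.99, max(0.01, sums[key] / counts[key]))

    truths = {obj: max(p, key=p.get) for obj, p in probs.items()}
    return truths, dict(trust)


# Toy data: S1 agrees with the majority on books but not on movies,
# and S3 shows the opposite pattern.
claims = [
    ("S1", "books", "b1", "a"), ("S2", "books", "b1", "a"), ("S3", "books", "b1", "z"),
    ("S1", "books", "b2", "a"), ("S2", "books", "b2", "a"), ("S3", "books", "b2", "z"),
    ("S1", "books", "b3", "p"), ("S3", "books", "b3", "q"),
    ("S1", "movies", "m1", "z"), ("S2", "movies", "m1", "a"), ("S3", "movies", "m1", "a"),
    ("S1", "movies", "m2", "z"), ("S2", "movies", "m2", "a"), ("S3", "movies", "m2", "a"),
    ("S1", "movies", "m3", "p"), ("S3", "movies", "m3", "q"),
]
truths, trust = discover_truth(claims)
```

In this toy data, S1 and S3 disagree on both `b3` and `m3`. A single trust score per source would leave those two conflicts tied, whereas keeping a separate score for each category lets the sketch favor S1 on books and S3 on movies.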