ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2019, Vol. 56, Issue (5): 1034-1047. doi: 10.7544/issn1000-1239.2019.20180461


A Collaborative Filtering Recommendation Algorithm for Multi-Source Heterogeneous Data

Wu Bin, Lou Zhengzheng, Ye Yangdong   

  1. (School of Information Engineering, Zhengzhou University, Zhengzhou 450001)
  • Online: 2019-05-01

Abstract: With the rapid development of e-commerce sites, both data characteristics and practical demands have changed. Data that is large-scale, multi-source, and heterogeneous now plays an important role. However, these characteristics make most existing collaborative filtering methods difficult to adapt to product recommendation in e-commerce systems. The immediate problem is how to integrate multi-source heterogeneous data so as to extract the maximum value from big data. In this paper, we first analyze the characteristics of the data from different information sources and design a separate modeling solution for each. We then propose a novel recommendation model for rating prediction that mitigates the sparsity problem by seamlessly integrating multi-relational data and visual content. Finally, we devise a computationally efficient learning algorithm, MSRA (multi-source heterogeneous information based recommendation algorithm), to optimize the proposed model. To verify its effectiveness, we conduct extensive experiments on a wide range of large-scale Amazon datasets. The experimental results demonstrate that 1) the proposed algorithm consistently and significantly outperforms several state-of-the-art collaborative filtering algorithms, and 2) it alleviates the item cold-start problem and yields more accurate predictions across various items.
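
The general idea named in the abstract and key words, matrix factorization for rating prediction with visual content used to help cold-start items, can be illustrated with a small sketch. This is not MSRA: the abstract does not give the model or its update rules, so everything below is an assumption made for illustration, including the additive item representation (a free latent vector plus a linear projection of visual features), the plain SGD training loop, the hyperparameters, the toy data, and all names (`train`, `predict`, `item_vector`, `E`).

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def item_vector(q_i, E, f_i):
    # Assumed item representation: free latent part q_i plus the
    # projection E @ f_i of the item's visual feature vector f_i.
    return [q + dot(row, f_i) for q, row in zip(q_i, E)]

def train(ratings, visual, n_users, n_items,
          k=4, lr=0.02, reg=0.02, epochs=300, seed=0):
    rng = random.Random(seed)
    d = len(visual[0])
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[0.0] * k for _ in range(n_items)]   # stays zero for unrated items
    E = [[rng.gauss(0, 0.1) for _ in range(d)] for _ in range(k)]
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    for _ in range(epochs):
        for u, i, r in ratings:
            v = item_vector(Q[i], E, visual[i])
            err = r - (mu + dot(P[u], v))
            for a in range(k):
                p_ua = P[u][a]  # keep the old value for the item-side updates
                P[u][a] += lr * (err * v[a] - reg * p_ua)
                Q[i][a] += lr * (err * p_ua - reg * Q[i][a])
                for b in range(d):
                    E[a][b] += lr * (err * p_ua * visual[i][b] - reg * E[a][b])
    return mu, P, Q, E

def predict(model, visual, u, i):
    mu, P, Q, E = model
    return mu + dot(P[u], item_vector(Q[i], E, visual[i]))

# Hypothetical toy data: (user, item, rating) triples.
# Items 0, 2 and 3 share the same visual feature; item 3 has no ratings.
visual = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
ratings = [(0, 0, 5), (0, 1, 1), (0, 2, 5),
           (1, 0, 1), (1, 1, 5), (1, 2, 2),
           (2, 0, 4), (2, 1, 2)]
model = train(ratings, visual, n_users=3, n_items=4)

# Item 3 is cold-start: its score comes only from the visual projection.
cold_user0 = predict(model, visual, 0, 3)
cold_user1 = predict(model, visual, 1, 3)
```

In this sketch an unrated item keeps a zero latent vector, so its predicted rating depends only on the global mean and on what the shared projection `E` has learned from visually similar rated items. That is one simple way side information can stand in for missing ratings; the paper's actual integration of multi-relational data and visual content may differ.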

Key words: matrix factorization, collaborative filtering, recommender systems, cold start, multi-source heterogeneous data
