ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2019, Vol. 56, Issue (5): 1034-1047. doi: 10.7544/issn1000-1239.2019.20180461

• Artificial Intelligence •

A Collaborative Filtering Recommendation Algorithm for Multi-Source Heterogeneous Data

Wu Bin, Lou Zhengzheng, Ye Yangdong

  1. (School of Information Engineering, Zhengzhou University, Zhengzhou 450001) (wubin@gs.zzu.edu.cn)
  • Published: 2019-05-01
  • Supported by: 
    National Key Research and Development Program of China (2018YFB1201403); National Natural Science Foundation of China (61772475, 61502434)

Abstract: With the rapid development of e-commerce sites, both data characteristics and practical requirements have changed considerably. Data characterized by large scale, multiple sources, and heterogeneity play an increasingly important role. However, these characteristics make most existing collaborative filtering methods difficult to apply directly to item recommendation. How to integrate multi-source heterogeneous data so as to maximize the value of the data is a pressing problem for current recommender systems. To address it, we first analyze the characteristics of each type of data in multi-source heterogeneous data and design a separate modeling approach for each. We then propose a novel recommendation model for the rating prediction task, which mitigates the data sparsity problem by integrating multi-relational data and visual content. Finally, we devise an efficient learning algorithm, MSRA (multi-source heterogeneous information based recommendation algorithm), to estimate the parameters of the proposed model. Experimental results on several Amazon datasets show that 1) the proposed algorithm significantly outperforms current mainstream collaborative filtering algorithms, and 2) it not only effectively alleviates the item cold-start problem but also predicts the actual ratings of different types of items more accurately.

Key words: matrix factorization, collaborative filtering, recommender systems, cold start, multi-source heterogeneous data
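
As an illustration of the general idea named in the abstract (matrix factorization for rating prediction, with visual content helping on cold-start items), the following is a minimal sketch and not the MSRA algorithm itself, whose model and update rules are not given on this page. It assumes a plain SGD-trained factorization in which each item vector is the sum of a free factor and a linear projection of an image feature vector; the function names, hyperparameters, and toy data are all hypothetical.

```python
import numpy as np

def train_mf_with_visual(ratings, visual, n_factors=8, lr=0.02, reg=0.05,
                         epochs=300, seed=0):
    """Hypothetical sketch: ratings is a list of (user, item, rating);
    visual is an (n_items, d) array of precomputed image features."""
    rng = np.random.default_rng(seed)
    n_users = 1 + max(u for u, _, _ in ratings)
    n_items, d = visual.shape
    P = 0.1 * rng.standard_normal((n_users, n_factors))  # user factors
    Q = np.zeros((n_items, n_factors))                   # free item factors
    E = 0.1 * rng.standard_normal((d, n_factors))        # visual projection
    mu = float(np.mean([r for _, _, r in ratings]))      # global mean rating
    for _ in range(epochs):
        for u, i, r in ratings:
            item_vec = Q[i] + visual[i] @ E
            err = r - (mu + P[u] @ item_vec)
            p_old = P[u].copy()
            # Gradient steps on squared error with L2 regularization.
            P[u] += lr * (err * item_vec - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
            E += lr * (err * np.outer(visual[i], p_old) - reg * E)
    return mu, P, Q, E

def predict(model, u, i, visual):
    mu, P, Q, E = model
    return mu + P[u] @ (Q[i] + visual[i] @ E)

# Toy data (assumed): items 0, 1, 4 share one visual style, items 2, 3 another.
# Item 4 has no ratings, so it plays the role of a cold-start item.
visual = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
ratings = [(0, 0, 5), (0, 1, 5), (0, 2, 1), (0, 3, 1), (1, 0, 5), (1, 2, 1),
           (2, 0, 1), (2, 2, 5), (2, 3, 5), (3, 1, 1), (3, 3, 5)]
model = train_mf_with_visual(ratings, visual)
```

Because the free factor of an unrated item stays at zero in this sketch, its predicted rating comes entirely from the shared visual projection, which is one simple way side information can compensate for missing ratings.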

CLC Number: