ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2017, Vol. 54 ›› Issue (7): 1426-1438. doi: 10.7544/issn1000-1239.2017.20160302

• Artificial Intelligence •

An Efficient Collaborative Filtering Algorithm Based on Graph Model and Improved KNN
(基于图和改进K近邻模型的高效协同过滤推荐算法)

Meng Huanyu1 (孟桓羽), Liu Zhen1 (刘真), Wang Fang2 (王芳), Xu Jiadong1 (徐家栋), Zhang Guoqiang3 (张国强)

  1 School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044
  2 Information Technology Center, Beijing Jiaotong University, Beijing 100044
  3 School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023
  (huanyum@bjtu.edu.cn)
  • Online: 2017-07-01
  • Published: 2017-07-01
  • Supported by:
    National Key Research and Development Program of China (2016YFB1200100); National Natural Science Foundation of China (61202429, 61572256); Fundamental Research Funds for the Central Universities (2015JBM042); Natural Science Foundation of Jiangsu Province (BK20141454)

Abstract: With the rapid growth of the Internet, recommender systems have become an effective way to deal with information overload: they ease the burden of filtering information and help users discover valuable content. Collaborative filtering is widely used because it is domain independent and helps users discover potential interests. However, because the user-item rating data is both large and sparse, current collaborative filtering still has considerable room for improvement in real-time performance and recommendation precision. This paper proposes the GK-CF method. By building a graph-based rating data model, it combines traditional collaborative filtering with graph computation and an improved KNN algorithm. Through message propagation in the graph and an improved similarity calculation model, candidate similar users are filtered first, and user similarity is computed only for those candidates. Based on the node structure of the user-item bipartite graph, a shortest path algorithm quickly locates the candidate items to be rated. On this basis, the algorithm is further parallelized and optimized on a parallel graph framework. Experiments on a physical cluster show that, compared with existing collaborative filtering algorithms, GK-CF improves both recommendation precision and rating prediction accuracy, and that it has good scalability and real-time performance.

Key words: collaborative filtering, social network, graph model, KNN, shortest path

CLC number:
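
The pipeline outlined in the abstract (filter candidate users on the graph, compute similarity only for those candidates, then locate unrated items through the bipartite structure) can be illustrated with a minimal single-machine sketch. It is not the authors' GK-CF implementation: the abstract does not specify the improved similarity model or the parallel graph framework, so plain cosine similarity over co-rated items stands in for the former, and two-hop and three-hop neighbourhoods in the user-item graph stand in for message propagation and the shortest path search. All function names (`build_graph`, `candidate_neighbors`, `cosine`, `recommend`) are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def build_graph(ratings):
    """Build both sides of the user-item bipartite graph from (user, item, score) edges."""
    by_user, by_item = defaultdict(dict), defaultdict(dict)
    for user, item, score in ratings:
        by_user[user][item] = score
        by_item[item][user] = score
    return by_user, by_item


def candidate_neighbors(by_user, by_item, target):
    """Users two hops from the target, i.e. sharing at least one rated item.

    Assumption: this stands in for the message-propagation filtering step.
    """
    found = set()
    for item in by_user[target]:
        found.update(by_item[item])
    found.discard(target)
    return found


def cosine(by_user, u, v):
    """Cosine similarity over co-rated items (a stand-in, not the paper's model)."""
    common = by_user[u].keys() & by_user[v].keys()
    if not common:
        return 0.0
    dot = sum(by_user[u][i] * by_user[v][i] for i in common)
    norm_u = sqrt(sum(by_user[u][i] ** 2 for i in common))
    norm_v = sqrt(sum(by_user[v][i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(ratings, target, k=2, top_n=3):
    """Predict scores for items three hops from the target, using the k nearest users."""
    by_user, by_item = build_graph(ratings)
    sims = {v: cosine(by_user, target, v)
            for v in candidate_neighbors(by_user, by_item, target)}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]

    # Items reachable as user -> item -> neighbor -> item that the target has not rated.
    weighted, weight = defaultdict(float), defaultdict(float)
    for v in neighbors:
        for item, score in by_user[v].items():
            if item not in by_user[target]:
                weighted[item] += sims[v] * score
                weight[item] += sims[v]

    predictions = {i: weighted[i] / weight[i] for i in weighted if weight[i] > 0}
    return sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

Restricting the similarity computation to users who share at least one item avoids the full pairwise comparison, which is where most of the cost lies when the rating data is sparse.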