ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2016, Vol. 53, Issue (12): 2793-2800. doi: 10.7544/issn1000-1239.2016.20160582

• Other Applied Technologies •



  • Published: 2016-12-01

A Method of Bayesian Probabilistic Matrix Factorization Based on Generalized Gaussian Distribution

Yan Cairong, Zhang Qinglong, Zhao Xue, Huang Yongfeng

  1. (School of Computer Science and Technology, Donghua University, Shanghai 201620)
  • Online: 2016-12-01

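Summary: Bayesian probabilistic matrix factorization is commonly used in personalized recommender systems because of its high prediction accuracy and good scalability, but its recommendation accuracy is affected by the sparsity of the initial rating matrix. This paper proposes GBPMF (generalized Gaussian distribution Bayesian PMF), a Bayesian probabilistic matrix factorization method based on the generalized Gaussian distribution. It adopts the generalized Gaussian distribution as the prior, selects the optimal model parameters automatically through machine learning, and trains efficiently with Gibbs sampling, which effectively mitigates matrix sparsity and reduces prediction error. To account for the influence of rating time differences on prediction, a temporal factor is added to the sampling algorithm, further optimizing the method and improving prediction accuracy. Experimental results show that GBPMF and its optimized variant GBPMF-T both achieve high accuracy on non-sparse and sparse matrices, with the latter being more accurate. When the matrix is very sparse, the accuracy of conventional Bayesian probabilistic matrix factorization drops sharply, whereas the proposed method remains stable.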

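Keywords: personalized recommender systems, Bayesian probabilistic matrix factorization, machine learning, generalized Gaussian distribution, sparse matrix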

Abstract: Bayesian probabilistic matrix factorization (Bayesian PMF) is widely used in personalized recommender systems because of its high prediction accuracy and good scalability. However, its accuracy is strongly affected by the sparsity of the initial rating matrix. This paper proposes GBPMF, a new Bayesian PMF method based on the generalized Gaussian distribution (GGD). The method adopts the GGD as the prior distribution and tunes the related model parameters automatically through machine learning. The model is trained efficiently with the Gibbs sampling algorithm, which mitigates the effect of sparsity and reduces prediction error. Because the time difference between ratings also influences prediction, a temporal factor is integrated into the sampling algorithm to further improve prediction accuracy. Experimental results show that GBPMF and its optimized variant GBPMF-T both achieve high accuracy on sparse and non-sparse matrices, with GBPMF-T being the more accurate of the two. When the matrix is very sparse, the accuracy of conventional Bayesian PMF decreases sharply, whereas the proposed methods remain stable.
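
For readers unfamiliar with the generalized Gaussian distribution, the following minimal Python sketch evaluates its density in the common textbook parameterization, p(x) = β / (2αΓ(1/β)) · exp(−(|x − μ|/α)^β). The abstract does not state which parameterization or parameter values GBPMF uses, so the function names (`ggd_pdf`, `gaussian_pdf`, `laplace_pdf`) and the symbols `mu`, `alpha`, and `beta` are illustrative assumptions, not the authors' implementation.

```python
import math


def ggd_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Generalized Gaussian density (textbook form; assumed, not taken from the paper).

    mu is the location, alpha > 0 the scale, beta > 0 the shape.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x - mu) / alpha) ** beta))


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Ordinary Gaussian density, used only for comparison."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def laplace_pdf(x, mu=0.0, b=1.0):
    """Laplace density, used only for comparison."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)


# beta = 2 with alpha = sigma * sqrt(2) recovers the Gaussian N(mu, sigma^2).
matches_gaussian = math.isclose(
    ggd_pdf(0.5, alpha=math.sqrt(2.0), beta=2.0), gaussian_pdf(0.5)
)

# beta = 1 recovers the Laplace distribution with scale b = alpha.
matches_laplace = math.isclose(ggd_pdf(0.5, alpha=1.0, beta=1.0), laplace_pdf(0.5))
```

The shape parameter `beta` is what makes the family flexible: smaller values give heavier tails than the Gaussian prior of conventional Bayesian PMF, and larger values give flatter, lighter-tailed densities.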

Key words: personalized recommender systems, Bayesian PMF, machine learning, generalized Gaussian distribution (GGD), sparse matrix