ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2016, Vol. 53 ›› Issue (8): 1673-1683. doi: 10.7544/issn1000-1239.2016.20160103

Special topic: 2016 Special Topic on Frontier Technologies in Data Mining

• Artificial Intelligence •

Online Purchase Prediction: A Method Based on User Operation Sequences and Choice Models

曾宪宇,刘淇,赵洪科,徐童,王怡君,陈恩红   

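  1. (School of Computer Science, University of Science and Technology of China, Hefei 230027) (zengxy@mail.ustc.edu.cn)
  • Published: 2016-08-01
  • Supported by: 
    National Science Fund for Distinguished Young Scholars (61325010); National Natural Science Foundation of China (61403358); Science and Technology Program for Public Wellbeing (2013GS340302); Special Fund for Members of the Youth Innovation Promotion Association (2014299); Fund of the MOE-Microsoft Key Laboratory of Multimedia Computing and Communication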

Online Consumptions Prediction via Modeling User Behaviors and Choices

Zeng Xianyu, Liu Qi, Zhao Hongke, Xu Tong, Wang Yijun, Chen Enhong

  1. (School of Computer Science, University of Science and Technology of China, Hefei 230027)
  • Online: 2016-08-01

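Abstract: The rise of e-commerce websites and the formation of users' online shopping habits have produced massive amounts of online consumption behavior data. Modeling how users compare and choose among similar products from these behavioral data (e.g., click data), and thereby accurately predicting their interests, preferences, and purchases, is of great importance for improving the purchase conversion rate of products. To address this problem, an online purchase prediction solution based on user behavioral sequence data and choice models is proposed. Specifically: 1) a behavioral-sequence utility function is used to estimate the best substitute item for the user in each purchase period (session), and a latent factor based choice model (LF-CM) is then built on the purchased item and the best substitute, which yields the user's purchase preferences and enables prediction of purchase behavior; 2) to make full use of all the choice and comparison information in each session and further improve prediction accuracy, a learning-to-rank model that operates on all items in a session (latent factor and sequence based choice model, LFS-CM) is proposed, which improves purchase prediction accuracy by fusing latent factors with the behavioral-sequence utility function; 3) experiments are conducted on a large-scale real-world dataset in a distributed environment and compared against baseline algorithms, confirming the effectiveness of the two proposed methods for online purchase prediction.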

Key words: online purchase prediction, choice model, behavioral sequence, sequence utility, distributed platform

Abstract: The rise of e-commerce sites and the formation of users’ online shopping habits have produced a huge amount of online consumer behavioral data. Mining users’ preferences from these behavioral logs (e.g., click data) and then predicting their final consumption choices is of great importance for improving the conversion rate of e-commerce. Along this line, this paper proposes combining users’ behavioral data with a choice model to predict which item each user will finally consume. Specifically, we first estimate the optimum substitute in each consumption session with a utility function of users’ behavioral sequences, and then build a latent factor based choice model (LF-CM) for the consumed items and the substitutes. In this way, users’ preferences can be computed and future consumption can be predicted. Going one step further, to make full use of users’ choice information and improve the precision of consumption prediction, we also propose a learning-to-rank model (latent factor and sequence based choice model, LFS-CM) that considers all the items in a session. By integrating latent factors with the utility function of users’ behavioral sequences, LFS-CM improves prediction precision. Finally, we evaluate our methods on a real-world dataset from Tmall in a distributed environment. The experimental results show that both LF-CM and LFS-CM perform well in predicting online consumption behaviors.

Key words: online consumption prediction, choice model, behavioral sequence, sequence utility, distributed platform
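
To make the idea of a session-level choice model over latent factors more concrete, the following is a minimal Python sketch. It is not the paper's LF-CM or LFS-CM formulation, which the abstract does not specify. It assumes a standard multinomial-logit choice model in which the utility of an item is the inner product of a user vector and an item vector plus a weighted behavioral-sequence score. The names `choice_probabilities`, `sgd_step`, `seq_utils`, and `alpha` are hypothetical, and the sequence scores in the demo are made-up illustrative values.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def choice_probabilities(user_vec, item_vecs, seq_utils, alpha=1.0):
    """Multinomial-logit choice probabilities over the items in one session.

    Assumed utility of item i: <user_vec, item_vecs[i]> + alpha * seq_utils[i].
    """
    utils = [dot(user_vec, q) + alpha * s for q, s in zip(item_vecs, seq_utils)]
    top = max(utils)  # subtract the max for numerical stability
    exps = [math.exp(u - top) for u in utils]
    total = sum(exps)
    return [e / total for e in exps]


def sgd_step(user_vec, item_vecs, seq_utils, bought, lr=0.05, alpha=1.0):
    """One gradient-ascent step on log P(bought item | session), in place."""
    probs = choice_probabilities(user_vec, item_vecs, seq_utils, alpha)
    # d log P / d utility_i = 1[i == bought] - p_i
    grads = [(1.0 if i == bought else 0.0) - p for i, p in enumerate(probs)]
    old_user = list(user_vec)  # item updates use the pre-update user vector
    for k in range(len(user_vec)):
        user_vec[k] += lr * sum(g * q[k] for g, q in zip(grads, item_vecs))
    for g, q in zip(grads, item_vecs):
        for k in range(len(q)):
            q[k] += lr * g * old_user[k]


if __name__ == "__main__":
    random.seed(0)
    dim = 4
    user = [random.uniform(-0.1, 0.1) for _ in range(dim)]
    items = [[random.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(3)]
    seq_scores = [1.0, 3.0, 2.0]  # hypothetical per-item sequence scores
    before = choice_probabilities(user, items, seq_scores)
    for _ in range(500):
        sgd_step(user, items, seq_scores, bought=0)
    after = choice_probabilities(user, items, seq_scores)
    print("P(bought) before:", before[0], "after:", after[0])
```

In this sketch the latent factors learn to raise the probability of the purchased item even when its sequence score alone would rank it low, which is one plausible reading of why the abstract combines the two signals.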

CLC number: