ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2020, Vol. 57 ›› Issue (7): 1369-1380. doi: 10.7544/issn1000-1239.2020.20190158

• Network Technology •

Research on User Behavior Understanding and Personalized Service Recommendation Algorithm in Twitter Social Networks

Yu Yaxin, Liu Meng, Zhang Hongyu

  1. (School of Computer Science and Engineering, Northeastern University, Shenyang 110169) (Key Laboratory of Intelligent Computing in Medical Image (Northeastern University), Ministry of Education, Shenyang 110169) (yuyx@mail.neu.edu.cn)
  • Online: 2020-07-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61871106, 61973059) and the National Key Research and Development Program of China (2016YFC0101500).

Abstract: The rapid growth of social networks has produced a large volume of short texts that carry spatio-temporal information. Because these texts are very short and their geographic location information is sparse, the topics of user behavior are difficult to capture. In addition, most existing work on user behavior understanding does not adequately integrate the dependencies among behavior elements, which leaves the resulting understanding one-sided. To address these problems, this paper first proposes two models that jointly consider the time, activity content, and activity region of user behavior, namely the user-time-activity model (UTAM) and the user-time-region model (UTRM), to capture the regularities of user behavior. Second, it uses latent Dirichlet allocation (LDA) to extract activity-service topics and proposes an activity-to-service topic model (ASTM) to mine the correspondence between activities and services. Finally, taking the intra-coupling of service location attributes into account, it proposes a matrix factorization algorithm that fuses distance and coupled similarity, matrix factorization based on couple & distance (MFCD), to improve recommendation quality. Extensive experiments on a real Twitter dataset show that the proposed models are effective in improving the quality of personalized service recommendation, and that MFCD also outperforms the traditional matrix factorization algorithm in understanding user behavior.

Key words: behavior understanding, topic model, personalized service recommendation, matrix factorization, non-independent and identically distributed (non-IID), coupled similarity
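
The abstract does not give the MFCD objective function, so the following is only a minimal sketch of the general idea it names: ordinary matrix factorization trained by stochastic gradient descent, plus a regularizer that pulls the latent factors of similar services together. The names `factorize`, `rmse`, `item_sim`, and `sim_reg`, the hyperparameter values, and the assumption that coupling and distance have already been combined into one precomputed similarity score per service pair are all hypothetical and are not taken from the paper.

```python
import math
import random


def factorize(ratings, n_users, n_items, item_sim,
              k=2, lr=0.02, reg=0.02, sim_reg=0.05, epochs=300, seed=0):
    """Similarity-regularized matrix factorization (illustrative sketch only).

    ratings  : list of (user, item, value) triples
    item_sim : dict {(i, j): s}, a hypothetical precomputed similarity in [0, 1]
               that is assumed to already combine attribute coupling and distance
    Returns (P, Q): user factors and item factors as lists of k-dim lists.
    """
    rng = random.Random(seed)
    # Small random initial factors.
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        # Standard SGD step on the squared error with L2 regularization.
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
        # Similarity regularizer: move factors of similar items toward each other.
        for (i, j), s in item_sim.items():
            for f in range(k):
                diff = Q[i][f] - Q[j][f]
                Q[i][f] -= lr * sim_reg * s * diff
                Q[j][f] += lr * sim_reg * s * diff
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error of the factor model on the observed ratings."""
    total = 0.0
    for u, i, r in ratings:
        pred = sum(pu * qi for pu, qi in zip(P[u], Q[i]))
        total += (r - pred) ** 2
    return math.sqrt(total / len(ratings))
```

With `item_sim = {(0, 1): 0.9}`, for instance, services 0 and 1 are treated as highly similar and their factor vectors are drawn together during training; setting `sim_reg=0` reduces the sketch to plain regularized matrix factorization, which is the kind of baseline the abstract compares against.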
