ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展) ›› 2018, Vol. 55 ›› Issue (1): 188-197. doi: 10.7544/issn1000-1239.2018.20160892

• Artificial Intelligence •

Topic Augmented Convolutional Neural Network for User Interest Recognition
(基于主题增强卷积神经网络的用户兴趣识别)

Du Yumeng (杜雨萌), Zhang Weinan (张伟男), Liu Ting (刘挺)

  1. (Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, Harbin 150001) (ymdu@ir.hit.edu.cn)
  • Published: 2018-01-01
  • Online: 2018-01-01
  • Funding: 
    National Basic Research Program of China (973 Program) (2014CB340503); National Natural Science Foundation of China (61472107, 61502120)

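Summary: This paper proposes a user interest recognition method based on a topic augmented convolutional neural network. A dual-channel CNN model fuses continuous semantic information with discrete topic information to obtain the category distribution of a user's microblogs, and the user's interests are then identified from that distribution by maximum likelihood estimation. Experimental results show that, compared with the method based on the Labeled LDA topic model and the traditional convolutional neural network method, the proposed topic augmented convolutional neural network alleviates the influence of noise words on user interest words and, by incorporating topic information, improves the classification of microblogs that contain many noise words, yielding significant improvements in both microblog classification and user interest recognition.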
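The dual-channel idea mentioned above can be illustrated with a small forward pass. The following is only a minimal sketch, assuming that one channel holds word embeddings (continuous semantic information) and the other holds embeddings of each token's assigned topic (discrete topic information), both with the same dimension. The function name `dual_channel_forward`, all shapes, and the choice of ReLU with max-over-time pooling are assumptions made for illustration; the abstract does not specify the actual architecture.

```python
import numpy as np

def dual_channel_forward(word_vecs, topic_vecs, filters, bias, w_out, b_out):
    """Forward pass of a toy two-channel text CNN (hypothetical layout).

    word_vecs, topic_vecs: (seq_len, dim) arrays, one row per token.
    filters: (n_filters, 2, window, dim) kernels spanning both channels.
    bias: (n_filters,); w_out: (n_filters, n_classes); b_out: (n_classes,).
    Returns a probability distribution over categories.
    """
    # Stack the two views of the same microblog as input channels.
    x = np.stack([word_vecs, topic_vecs])            # (2, seq_len, dim)
    n_filters, _, window, _ = filters.shape
    n_pos = x.shape[1] - window + 1                  # valid convolution positions
    feature_maps = np.empty((n_filters, n_pos))
    for i in range(n_pos):
        patch = x[:, i:i + window, :]                # (2, window, dim)
        # Each filter sums over channel, window position, and embedding dim.
        feature_maps[:, i] = np.tensordot(filters, patch, axes=3) + bias
    feature_maps = np.maximum(feature_maps, 0.0)     # ReLU
    pooled = feature_maps.max(axis=1)                # max-over-time pooling
    logits = pooled @ w_out + b_out
    logits = logits - logits.max()                   # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()
```

Because every filter spans both channels, a token whose word embedding is uninformative can still contribute through its topic embedding, which is one plausible reading of how topic information might help with noisy microblogs.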

Abstract: With the development of mobile Internet technology and the spread of mobile terminals, many social websites and applications have appeared on the Internet. Microblogging, one such application, has attracted a large number of users thanks to its ease of use and rapid propagation. A user may receive hundreds of microblogs every day, which leads to information overload and makes it harder for the user to acquire information and knowledge. At the same time, more and more merchants treat microblogs as a marketing platform, which makes targeted advertisement delivery a problem of high commercial value. Recognizing the interests of microblog users can help address both problems. This paper proposes a topic augmented convolutional neural network approach to recognize user interest. By integrating continuous semantic information and discrete topic information, the proposed approach first obtains the category distribution of a user's microblogs. It then recognizes the user's interests through maximum likelihood estimation over that category distribution. Experimental results show that the proposed approach significantly outperforms both the Labeled LDA based approach and the traditional convolutional neural network approach on microblog classification and user interest recognition.
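The maximum likelihood step can be made concrete under a simple assumption: if each microblog's category is treated as an independent draw from a per-user categorical distribution, the maximum likelihood estimate of that distribution is the relative frequency of each category among the user's classified microblogs. The sketch below follows that assumption; the function name `estimate_user_interest` and the `top_k` parameter are hypothetical, and the paper's actual likelihood may be defined differently.

```python
import numpy as np

def estimate_user_interest(post_category_probs, top_k=1):
    """Estimate a user's interest distribution from per-microblog predictions.

    post_category_probs: (n_posts, n_categories) classifier outputs.
    Returns (interest_dist, top_categories).
    """
    probs = np.asarray(post_category_probs, dtype=float)
    # Assign each microblog to its most probable category.
    predicted = probs.argmax(axis=1)
    counts = np.bincount(predicted, minlength=probs.shape[1])
    # MLE of a categorical distribution: relative frequencies.
    interest_dist = counts / counts.sum()
    # Stable sort keeps the lower category index first on ties.
    top = np.argsort(-interest_dist, kind="stable")[:top_k]
    return interest_dist, top.tolist()

posts = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.5, 0.4, 0.1],
]
dist, top = estimate_user_interest(posts)
# dist is [0.75, 0.25, 0.0] and top is [0]
```

Three of the four microblogs fall into category 0, so that category receives probability 0.75 and is returned as the user's top interest.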

Key words: topic model, convolutional neural network (CNN), microblog classification, user interest recognition, microblog

CLC Number: