ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2021, Vol. 58, Issue (7): 1385-1394. doi: 10.7544/issn1000-1239.2021.20200817

Special Issue: 2021 Special Topic on Fake Information Detection


Fake Review Detection Based on Joint Topic and Sentiment Pre-Training Model

Zhang Dongjie1, Huang Longtao1, Zhang Rong1, Xue Hui1, Lin Junyu2, Lu Yao3   

  1. Alibaba Group, Beijing 100102; 2. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093; 3. Langfang Polytechnic Institute, Langfang, Hebei 065001
  • Online:2021-07-01
  • Supported by: 
    This work was supported by the Key Technology Research and Development Program of Langfang (2020011005).

Abstract: Product reviews are an important basis for users' online purchasing decisions. Driven by profit, however, businesses often hire professional writers to produce large numbers of fake reviews that promote their own products and denigrate competitors, which misleads users, leads to unfair business competition, and degrades the user experience. To address this problem, we improve existing spam review detection methods with pre-trained models and propose a joint pre-training learning method that integrates both the semantic and the sentiment information of product reviews. Exploiting the strong semantic representation capability of pre-trained models, we use two pre-trained encoders within the joint learning framework to extract the semantic and sentiment features of reviews, and we integrate the two types of features through the joint pre-training learning method. In addition, we add the Center Loss function to optimize the model. We conduct verification experiments on multiple public datasets and multiple tasks. The results show that the proposed joint model achieves the best results in these experiments and generalizes better on both fake review detection and sentiment analysis tasks.

Key words: fake review detection, pre-training model, sentiment analysis, joint learning framework, Center Loss
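
The abstract mentions fusing semantic and sentiment features and adding a Center Loss term. The sketch below is only an illustration of that general idea, not the authors' implementation: it assumes fusion by simple concatenation, uses a common form of center loss (half the squared distance between each feature vector and its class center, averaged over the batch here), and all names (`fuse`, `center_loss`) and numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def fuse(semantic: Sequence[float], sentiment: Sequence[float]) -> List[float]:
    """Concatenate two feature vectors (an assumed, simple fusion choice)."""
    return list(semantic) + list(sentiment)


def center_loss(features: Sequence[Sequence[float]],
                labels: Sequence[int],
                centers: Dict[int, Sequence[float]]) -> float:
    """Batch mean of 0.5 * ||x_i - c_{y_i}||^2, where c_{y_i} is the class center."""
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return total / len(features)


# Toy batch with made-up numbers: 2-d "semantic" and 1-d "sentiment" features.
batch = [
    fuse([1.0, 0.0], [1.0]),  # label 0, e.g. a genuine review
    fuse([0.0, 2.0], [0.0]),  # label 1, e.g. a fake review
]
labels = [0, 1]
# Hypothetical class centers in the fused 3-d feature space.
centers = {0: [1.0, 0.0, 0.0], 1: [0.0, 0.0, 0.0]}

loss = center_loss(batch, labels, centers)
print(loss)
```

In practice such a term would typically be added to a classification loss with a weighting factor, and the centers would be updated during training; both details are omitted here.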

CLC Number: