Fake Review Detection Based on Joint Topic and Sentiment Pre-Training Model

Zhang Dongjie; Huang Longtao; Zhang Rong; Xue Hui; Lin Junyu; Lu Yao

doi:10.7544/issn1000-1239.2021.20200817

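¹(Alibaba Group, Beijing 100102)
²(Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093)
³(Langfang Polytechnic Institute, Langfang, Hebei 065001) (yurui.zdj@alibaba-inc.com)

Funds: This work was supported by the Key Technology Research and Development Program of Langfang (2020011005).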

CLC number: TP399
Publication history
- Published: 2021-06-30

Abstract

Abstract: Product reviews are an important basis for users' online purchasing decisions. Driven by profit, however, merchants often hire professional writers to produce large numbers of fake reviews that mislead users, either to promote their own products or to denigrate competitors. This practice leads to unfair business competition and a very poor user experience. To address this problem, we improve existing fake review detection models through sentiment pre-training and propose a joint pre-training learning method that integrates both the semantic and the sentiment information of reviews. Given the strong semantic representation ability of pre-trained models, the joint learning framework uses two pre-trained encoders to extract the semantic and the sentiment context features of a review, respectively, integrates the two kinds of features through joint training, and finally optimizes the model with the Center Loss function. We conducted verification experiments on multiple public datasets and multiple different tasks. The results show that the proposed joint model achieves the best results reported to date on both fake review detection and sentiment polarity analysis, and that it generalizes better.
- fake review detection /
- pre-training model /
- sentiment analysis /
- joint learning framework /
- Center Loss
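
The abstract describes a training objective that combines features from two encoders and adds Center Loss to the classification objective. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes that the two feature vectors are fused by concatenation, that Center Loss is averaged over the batch, and that it is added to a softmax cross-entropy term with a weight `lam`; the abstract does not specify any of these choices. The function names and plain Python lists standing in for encoder outputs are hypothetical.

```python
import math


def concat_features(semantic, sentiment):
    """Fuse two feature vectors by concatenation (an assumed fusion choice)."""
    return list(semantic) + list(sentiment)


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example, computed with a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def center_loss(features, labels, centers):
    """Half the squared distance from each feature to its class center,
    averaged over the batch (assumed normalization)."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total / len(features)


def joint_loss(logits_batch, features, labels, centers, lam=0.01):
    """Mean cross-entropy plus lam times Center Loss (lam is hypothetical)."""
    ce = sum(
        softmax_cross_entropy(logits, y)
        for logits, y in zip(logits_batch, labels)
    ) / len(labels)
    return ce + lam * center_loss(features, labels, centers)


if __name__ == "__main__":
    # Toy stand-ins for the outputs of the semantic and sentiment encoders.
    fused = [
        concat_features([0.9, 0.1], [0.2]),
        concat_features([0.1, 0.8], [0.7]),
    ]
    labels = [0, 1]  # e.g. 0 = genuine, 1 = fake (assumed label coding)
    centers = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
    logits = [[2.0, 0.5], [0.3, 1.7]]
    print(joint_loss(logits, fused, labels, centers, lam=0.1))
```

In this sketch the Center Loss term pulls each fused feature toward the center of its class, which is the usual motivation for pairing it with a classification loss. In a real training loop the class centers would be updated alongside the model parameters rather than fixed as they are here.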
