ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2019, Vol. 56 ›› Issue (9): 1939-1952. doi: 10.7544/issn1000-1239.2019.20180624

• Information Processing •

Analysis and Evaluation of Content Credibility in Social Media

刘波1,3, 李洋2,3, 孟青1, 汤小虎1, 曹玖新2   

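  1(School of Computer Science and Engineering, Southeast University, Nanjing 211189); 2(School of Cyber Science and Engineering, Southeast University, Nanjing 211189); 3(Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 211189) (bliu@seu.edu.cn)
  • Published: 2019-09-10
  • Funding: 
    National Key Research and Development Program of China (2017YFB1003000); National Natural Science Foundation of China (61370208, 61472081, 61320106007, 61272531); National High Technology Research and Development Program of China (863 Program) (2013AA013503); Jiangsu Provincial Key Laboratory of Network and Information Security (BM2003201); Jiangsu Provincial Key Laboratory of Computer Network Technology (BE2018706)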

Evaluation of Content Credibility in Social Media

Liu Bo1,3, Li Yang2,3, Meng Qing1, Tang Xiaohu1, Cao Jiuxin2   

  1(School of Computer Science and Engineering, Southeast University, Nanjing 211189); 2(School of Cyber Science and Engineering, Southeast University, Nanjing 211189); 3(Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing 211189)
  • Online: 2019-09-10
  • Supported by: 
    This work was supported by the National Key Research and Development Program of China (2017YFB1003000), the National Natural Science Foundation of China (61370208, 61472081, 61320106007, 61272531), the National High Technology Research and Development Program of China (863 Program) (2013AA013503), the Jiangsu Provincial Key Laboratory of Network and Information Security Foundation (BM2003201), and the Jiangsu Provincial Key Laboratory of Computer Network Technology Foundation (BE2018706).

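Abstract: In recent years, social media has broadened the channels through which people obtain information, but it has also facilitated the spread of false information and caused serious negative effects. Compared with traditional Internet media, social media contains more complex and diverse information, which poses new challenges for judging content credibility. Existing studies of content credibility in social media have done much work on mining the factors that influence credibility but lack any treatment of noisy data. Large numbers of useless tweets interfere with tweet-level credibility judgments and, in turn, affect event-level credibility judgments, so it is particularly important to filter the genuinely useful tweets out of the noise. This paper considers both the topic factor and the conformity behavior of users at the tweet level, reducing the role that noisy data such as conformity retweets play in credibility judgment. It studies the credibility of social media content, builds a content credibility evaluation model using a Bayesian network, and verifies the model's effectiveness on a public Sina Weibo dataset.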

Keywords: social media, content credibility, topic factor, conformity behavior, probabilistic graphical model

Abstract: In recent years, the rapid development of social media has broadened people's access to information, but it has also made it easier for false information to spread, with a series of negative impacts on cyber security. Compared with traditional online media, information in social media is more open and complex, which makes it much harder for individuals to judge its credibility, and filtering out non-credible information has become an urgent problem. Existing research on credibility assessment in social media has devoted considerable effort to extracting the factors that influence credibility but has neglected the handling of noisy data. As a result, large numbers of useless tweets enter the evaluation process and bias the credibility assessment, so selecting the tweets that actually matter is particularly important. This paper takes both the topic factor and the conformity behavior of users into account to reduce the influence of noisy data, such as conformity retweeting, on credibility assessment, and uses a Bayesian network to build an evaluation model for content credibility in social media. The effectiveness of the model is verified on a public Sina Weibo dataset.

Key words: social media, content credibility, topic factor, conformity, probabilistic graphical model

CLC number: