ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development (计算机研究与发展), 2018, Vol. 55, Issue (9): 1972-1986. doi: 10.7544/issn1000-1239.2018.20180155

Special topic: 2018 Excellent Young Scientists special issue

• Information Processing •

A Storyline Generation Method Based on Social Event Correlation

李莹莹1,2,马帅1,2,蒋浩谊1,2,刘喆2,胡春明1,2,李雄3   

  1 (State Key Laboratory of Software Development Environment (Beihang University), Beijing 100191); 2 (Beijing Advanced Innovation Center for Big Data and Brain Computing (Beihang University), Beijing 100191); 3 (National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029) (liyy@act.buaa.edu.cn)
  • Publication date: 2018-09-01
  • Funding:
    This work was supported by the National Natural Science Foundation of China (U1636210, 61421003) and the National Natural Science Foundation of China for Excellent Young Scientists (61322207).

An Approach for Storytelling by Correlating Events from Social Networks

Li Yingying1,2, Ma Shuai1,2, Jiang Haoyi1,2, Liu Zhe2, Hu Chunming1,2, Li Xiong3   

  1 (State Key Laboratory of Software Development Environment (Beihang University), Beijing 100191); 2 (Beijing Advanced Innovation Center for Big Data and Brain Computing (Beihang University), Beijing 100191); 3 (National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029)
  • Online: 2018-09-01

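Abstract (translated from the Chinese): Social networks such as Twitter and Sina Weibo have become important platforms for reporting public events, and they provide valuable data for monitoring events and their evolution. However, the informal words and fragmented texts in these data make it challenging to extract descriptive information from them. Monitoring event evolution across a large volume of rapidly generated microblogs is also difficult. We propose to monitor events in social networks and to analyze the evolution of events that share the same topic. This yields an overview of the events at a coarse-grained level as well as detailed information about them at a fine-grained level. The task is carried out by three consecutive components: 1) a structure-based approach detects events from microblogs; 2) events are clustered according to their latent semantic information, and each resulting cluster is defined as a story; 3) a graph-based approach generates a storyline for each story, where the storyline is a directed acyclic graph with a summary that represents the evolution of the events within the story. User experience evaluation experiments show that the proposed method achieves higher accuracy and comprehensibility than existing methods and helps users monitor events and their evolution.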

Keywords (translated from the Chinese): social network, event evolution, storyline, clustering, topic model

Abstract: Social networks such as Twitter and Sina Weibo have become popular platforms for reporting public events. They provide valuable data for monitoring events and their evolution. However, informal words and fragmented texts make it challenging to extract descriptive information from these data, and monitoring event progression across rapidly accumulating microblogs is also difficult. To this end, we monitor events in social networks and analyze the progression of events that share a common topic, which provides both an overview and detailed documentation of the events. We use three consecutive components to achieve this. First, we use a structure-based approach to detect events from the microblog dataset. Second, we cluster the events by topic based on their latent semantic information, and define each cluster as a story. Third, we use a graph-based approach to generate a storyline for each story. The storyline is a directed acyclic graph (DAG) with a summary that expresses the progression of events in the story. A user experience evaluation indicates that this method helps users monitor events and their progression, achieving better accuracy and comprehensibility than state-of-the-art methods.

Key words: social network, event progression, storyline, clustering, topic model
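
The second and third components summarized in the abstract (grouping events into stories, then linking the events of each story into a DAG) can be illustrated with a toy sketch. This is not the authors' implementation: the `Event` fields, the Jaccard keyword overlap used as a stand-in for latent semantic similarity, the greedy clustering rule, and the 0.3 threshold are all assumptions chosen for illustration, and event detection and summarization are omitted.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; the paper's actual representation is not given here."""
    event_id: str
    timestamp: int        # assumed unit: hours since the first observation
    keywords: frozenset   # assumed: keywords extracted from the event's microblogs


def jaccard(a, b):
    """Keyword overlap, used here as a simple stand-in for latent semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_into_stories(events, threshold=0.3):
    """Greedy clustering: an event joins the first story containing a similar event."""
    stories = []
    for ev in sorted(events, key=lambda e: e.timestamp):
        for story in stories:
            if any(jaccard(ev.keywords, other.keywords) >= threshold for other in story):
                story.append(ev)
                break
        else:
            # No existing story is similar enough, so start a new one.
            stories.append([ev])
    return stories


def build_storyline(story, threshold=0.3):
    """Link similar events from earlier to later.

    Every edge points strictly forward in time, so the graph cannot contain a cycle.
    """
    ordered = sorted(story, key=lambda e: e.timestamp)
    edges = []
    for earlier, later in combinations(ordered, 2):
        if (earlier.timestamp < later.timestamp
                and jaccard(earlier.keywords, later.keywords) >= threshold):
            edges.append((earlier.event_id, later.event_id))
    return edges


events = [
    Event("e1", 0, frozenset({"earthquake", "sichuan"})),
    Event("e2", 5, frozenset({"earthquake", "rescue", "sichuan"})),
    Event("e3", 9, frozenset({"rescue", "sichuan", "donation"})),
    Event("e4", 3, frozenset({"football", "final"})),
]

stories = cluster_into_stories(events)
for story in stories:
    print([e.event_id for e in story], build_storyline(story))
```

In this toy data the three earthquake-related events form one story whose storyline is a chain, while the unrelated event forms a story of its own. A real system would replace the keyword overlap with a topic-model-based similarity, as the key words suggest.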

CLC number: