ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2015, Vol. 52, Issue (8): 1806-1816. doi: 10.7544/issn1000-1239.2015.20150253

Special topic: 2015 Artificial Intelligence Technologies for Big Data

• Artificial Intelligence •

Sentiment Uncertainty Measure and Classification of Negative Sentences

Zhang Zhifei1,2, Miao Duoqian1, Nie Jianyun2, Yue Xiaodong3

  1(Department of Computer Science and Technology, Tongji University, Shanghai 201804); 2(Department of Computer Science and Operations Research, University of Montreal, Montreal H3C 3J7); 3(School of Computer Engineering and Science, Shanghai University, Shanghai 200444) (tjzhifei@163.com)
  • Published: 2015-08-01
  • Online: 2015-08-01
  • Supported by: National Natural Science Foundation of China (61273304, 61202170); Specialized Research Fund for the Doctoral Program of Higher Education of China (20130072130004)

Abstract: Sentiment classification is a powerful tool for analyzing social media big data. Negative sentences, that is, sentences containing negation words, are a common yet special linguistic phenomenon, so classifying their sentiment is of particular importance. Negation words and sentiment words are equally important in this task: a negation word matters when it modifies a sentiment word, but it can also carry sentiment on its own. Existing methods consider only the case in which negation words modify sentiment words and ignore the sentiment that negation words themselves convey. To handle both cases in a unified way, this paper proposes a sentiment classification model for negative sentences based on decision-theoretic rough sets. First, the sentiment value of each clause in a sentence is calculated from several constructed lexicons together with inter-sentence relations, and a measure of sentence-level sentiment uncertainty based on Kullback-Leibler (KL) divergence is derived from the clause sentiment values. Then, negative sentences are represented by four general features (initial polarity, sentiment uncertainty, successive punctuation, and sentence type) together with two negation-related features: single negation and salient adverb. Finally, a positive-region attribute reduction algorithm based on the decision correlation degree is proposed to generate decision rules for the sentiment classification of negative sentences. Experimental results verify the effectiveness of the model and show that the sentiment uncertainty measure benefits sentiment classification.

Key words: sentiment classification, negative sentences, sentiment uncertainty, decision-theoretic rough sets, attribute reduction
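
The abstract names KL divergence as the basis of the sentiment uncertainty measure but does not give the formula. The sketch below is only an illustration of how such a measure could look, not the authors' definition. It assumes that clause sentiment values are signed numbers, that they are collapsed into a two-outcome (positive, negative) distribution, and that uncertainty is scored as one minus the KL divergence from the uniform distribution. The function names `polarity_distribution` and `sentiment_uncertainty` are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Standard D_KL(p || q) in bits for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log2(pi / qi)
    return total


def polarity_distribution(clause_values):
    """Hypothetical step: share of positive vs. negative clause sentiment mass."""
    pos = sum(v for v in clause_values if v > 0)
    neg = sum(-v for v in clause_values if v < 0)
    mass = pos + neg
    if mass == 0:
        return (0.5, 0.5)  # no sentiment evidence: treat as maximally mixed
    return (pos / mass, neg / mass)


def sentiment_uncertainty(clause_values):
    """Hypothetical score in [0, 1]: 1 minus KL divergence from uniform.

    This is an assumption made for illustration; the paper's actual
    measure may be defined differently.
    """
    p = polarity_distribution(clause_values)
    return 1.0 - kl_divergence(p, (0.5, 0.5))


if __name__ == "__main__":
    print(sentiment_uncertainty([0.8, 0.6]))   # all clauses positive
    print(sentiment_uncertainty([0.5, -0.5]))  # evenly split clauses
```

Under these assumptions, a sentence whose clauses all point the same way scores low, and a sentence whose clauses pull in opposite directions scores high. With two outcomes and a uniform reference, the score equals the binary entropy of the polarity distribution.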

CLC Number: