    A Unified Framework for Cross-Domain Sentiment Classification

    • Abstract: Sentiment classification of documents aims to determine whether the opinion expressed in a given document is positive or negative. Existing studies have shown that supervised classification approaches usually perform well on this task. In most cases, however, the available labeled data and the unlabeled data to be classified do not belong to the same domain, and the performance of a supervised classifier drops sharply when it is transferred from one domain to another. This gives rise to the cross-domain sentiment classification problem, which is receiving increasing attention. To address it, a unified framework is proposed that performs cross-domain sentiment classification in several stages. First, the accurate labels of the source-domain documents are used to obtain initial labels for the target-domain documents. Then, the target domain is modeled as a weighted network, and some target-domain documents whose opinions have been determined more accurately are chosen as “source components” and “sink components”. Finally, with the help of these components, a heat conduction process is applied iteratively to the weighted network to improve the classification of the target-domain data. Experiments are conducted on data from three different domains, transferring between two of them. The results indicate that the proposed framework can substantially improve the accuracy of cross-domain sentiment classification.
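
The staged procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, the edge weights, or the way source and sink components are selected, so everything below is an assumption made for illustration: the function name `heat_conduction_labels`, the parameters `k` and `iters`, the use of the `k` highest and lowest initial scores as sources and sinks, and the neighbor-averaging update (a standard discrete heat diffusion with fixed boundary temperatures). It is not the authors' implementation.

```python
def heat_conduction_labels(weights, init_scores, k=1, iters=50):
    """Refine initial sentiment scores by heat diffusion on a weighted graph.

    weights     -- symmetric matrix of non-negative document similarities (assumed)
    init_scores -- initial scores in [-1, 1], e.g. from a source-domain classifier
    k           -- hypothetical number of source and of sink documents
    """
    n = len(init_scores)
    order = sorted(range(n), key=lambda i: init_scores[i])
    sinks = set(order[:k])       # most confidently negative documents
    sources = set(order[-k:])    # most confidently positive documents

    temp = list(init_scores)
    for i in sources:
        temp[i] = 1.0            # sources are held at the "hot" temperature
    for i in sinks:
        temp[i] = -1.0           # sinks are held at the "cold" temperature

    for _ in range(iters):
        new = list(temp)
        for i in range(n):
            if i in sources or i in sinks:
                continue         # boundary nodes keep their fixed temperature
            total = sum(weights[i][j] for j in range(n) if j != i)
            if total > 0:
                # Each free node moves to the weighted mean of its neighbors.
                new[i] = sum(weights[i][j] * temp[j]
                             for j in range(n) if j != i) / total
        temp = new

    return [1 if t >= 0 else -1 for t in temp]


# Toy target domain: documents 0 and 1 are similar, as are documents 2 and 3.
W = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.9],
    [0.0, 0.1, 0.9, 0.0],
]
initial = [0.9, -0.1, 0.2, -0.8]   # documents 1 and 2 start with weak, wrong signs
labels = heat_conduction_labels(W, initial)
```

In this toy example, document 0 acts as the source and document 3 as the sink; the two uncertain documents take the sign of the neighbor they are most strongly connected to.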

       
