
    Cross-Domain Opinion Analysis Based on a Random-Walk Model

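    • Summary: In recent years, researchers have made some progress in cross-domain opinion analysis. However, existing methods and systems usually analyze the opinions of target-domain documents using only labeled documents or only labeled sentiment words, and they lack a unified framework that fuses all of the knowledge shared between documents and sentiment words. This paper proposes a cross-domain opinion analysis method based on a random-walk model. The model uses all the relationships between documents and words in both the source domain and the target domain so that documents and words reinforce each other, and it aims to integrate the relationships between documents, the relationships between words, and the relationships between documents and words into one complete theoretical framework. Experimental results show that the proposed algorithm substantially improves the accuracy of cross-domain opinion analysis.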

       

      Abstract: More and more people express their opinions on products, books, movies, and other topics on review sites, forums, discussion groups, and blogs. Determining the opinion expressed in a given Web document (that is, opinion analysis) has therefore drawn much attention. To be accurate, many opinion analysis methods require abundant labeled data, but the amount of labeled data is very imbalanced across domains. In recent years, some studies have therefore addressed cross-domain opinion analysis. However, most of these attempts rely on only the labeled documents or only the labeled sentiment words, so they fail to uncover the full knowledge shared between documents and sentiment words. This paper proposes an approach to cross-domain opinion analysis based on a random-walk model that simultaneously uses documents and words from both the source domain and the target domain. The approach makes full use of the mutual reinforcement between documents and words by fusing four kinds of relationships: the relationships between documents, the relationships between words, the relationships between words and documents, and the relationships between documents and words. Experimental results indicate that the proposed algorithm improves the performance of cross-domain opinion analysis dramatically. The average accuracy of the proposed approach is about 15% higher than that of traditional classifiers and about 7% higher than that of the state-of-the-art method.
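
      The mutual reinforcement described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update equations, so everything below is an assumption made for illustration: the function names, the row-normalized matrices, the mixing weight `alpha`, the clamping of labeled source-domain documents, and the toy data are hypothetical and are not the paper's formulation.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1; all-zero rows stay zero."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def propagate(doc_doc, word_word, doc_word, seed_docs, alpha=0.5, iters=50):
    """Hypothetical mutual-reinforcement loop (not the paper's exact rule).

    seed_docs maps the index of a labeled source-domain document to
    +1.0 (positive) or -1.0 (negative). Scores flow along four kinds of
    edges: doc->doc, word->word, doc->word, and word->doc.
    """
    dd = row_normalize(doc_doc)               # document-document
    ww = row_normalize(word_word)             # word-word
    dw = row_normalize(doc_word)              # words feeding documents
    wd = row_normalize(transpose(doc_word))   # documents feeding words
    n_docs, n_words = len(doc_word), len(doc_word[0])
    doc_scores = [seed_docs.get(i, 0.0) for i in range(n_docs)]
    word_scores = [0.0] * n_words
    for _ in range(iters):
        # Words are reinforced by their documents and by related words.
        from_docs = matvec(wd, doc_scores)
        from_words = matvec(ww, word_scores)
        word_scores = [alpha * a + (1 - alpha) * b
                       for a, b in zip(from_docs, from_words)]
        # Documents are reinforced by their words and by related documents.
        from_words = matvec(dw, word_scores)
        from_docs = matvec(dd, doc_scores)
        doc_scores = [alpha * a + (1 - alpha) * b
                      for a, b in zip(from_words, from_docs)]
        # Assumption: labeled source documents keep their known labels.
        for i, label in seed_docs.items():
            doc_scores[i] = label
    return doc_scores, word_scores


# Toy data (invented): documents 0 and 1 are labeled source documents,
# documents 2 and 3 are unlabeled target documents.
doc_word = [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 1]]
doc_doc = [[0, 0, 1, 0],
           [0, 0, 0, 1],
           [1, 0, 0, 0],
           [0, 1, 0, 0]]
word_word = [[0, 0, 1, 0],
             [0, 0, 0, 1],
             [1, 0, 0, 0],
             [0, 1, 0, 0]]
doc_scores, word_scores = propagate(doc_doc, word_word, doc_word,
                                    seed_docs={0: 1.0, 1: -1.0})
```

      In this toy graph the sign of each score is read as the predicted polarity. Target document 2 and target-only word 2 end up positive, and target document 3 and word 3 end up negative, even though none of them carries a label.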

       
