Domain Alignment Adversarial Unsupervised Cross-Domain Text Sentiment Analysis Algorithm

贾熹滨; 曾檬; 米庆; 胡永利

doi:10.7544/issn1000-1239.20210039

Domain Alignment Adversarial Unsupervised Cross-Domain Text Sentiment Analysis Algorithm

Abstract

In practical applications, sentiment analysis automatically determines the sentiment polarity of text and thereby supports effective decision-making, but it depends on a large number of annotated samples. To reduce the dependence on manual annotation, researchers have proposed cross-domain sentiment analysis based on domain adaptation, which transfers a source-domain model trained on labeled samples to an unlabeled target domain. However, existing domain adaptation methods approach the transfer from only one angle: either reducing the discrepancy between domain-specific features or extracting domain-invariant features. Because cross-domain text data contains both domain-specific and domain-invariant features, we propose a domain alignment adversarial algorithm for unsupervised cross-domain text sentiment analysis. The algorithm uses a progressive transfer strategy to reduce the domain discrepancy layer by layer across different semantic layers, and applies a jointly optimized domain adaptation algorithm in the high-level semantic subspace to transfer domain knowledge across cross-domain text data. We evaluate the algorithm on 24 cross-domain sentiment classification tasks drawn from 2 public cross-domain text sentiment datasets, comparing it with representative and state-of-the-art methods from 4 categories of domain adaptation algorithms. The proposed algorithm achieves the highest average classification accuracy over the 24 tasks. Together with the transfer performance analysis and the feature distribution visualization, the results indicate that the algorithm improves, to some extent, both the classification performance and the transfer performance of existing unsupervised cross-domain text sentiment analysis algorithms.
