ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2018, Vol. 55, Issue (11): 2439-2451. doi: 10.7544/issn1000-1239.2018.20170496


Domain Alignment Based on Multi-Viewpoint Domain-Shared Feature for Cross-Domain Sentiment Classification

Jia Xibin1,2, Jin Ya1,2, Chen Juncheng1   

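  1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124; 2. Beijing Municipal Key Laboratory of Multimedia and Intelligent Software Technology (Beijing University of Technology), Beijing 100124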
  • Online:2018-11-01

Abstract: A large set of well-labeled training samples is an important foundation for good performance in supervised learning, but labeling samples is labor-intensive and time-consuming. Moreover, it is not always feasible to obtain enough well-labeled data in every application to train a classifier. At the same time, directly applying a model trained on the source domain to the target domain usually degrades accuracy, because the data distributions of the two domains differ. To address these problems, we propose an algorithm named domain alignment based on multi-viewpoint domain-shared feature for cross-domain sentiment classification (DAMF). First, we fuse three sentiment lexicons to eliminate the polarity divergence of the domain-shared feature words, which are selected by their mutual information value. On this basis, we extract two kinds of word pairs within each domain: pairs with the same sentiment polarity, obtained with four syntax rules, and pairs with a strong association, obtained with an association rule algorithm. Then, we use the domain-shared words that have no polarity divergence as a bridge to establish an indirect mapping between domain-specific words in different domains. Domain alignment is achieved by constructing a unified feature representation space for the different domains. Experiments on four public data sets from the Amazon product reviews corpora show the effectiveness of the proposed algorithm for cross-domain sentiment classification.
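The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DAMF implementation: the function names, the mutual information threshold, and the choice to compute mutual information on the labeled source domain only are assumptions made for illustration. The lexicon fusion, the four syntax rules, and the association rule mining are not reproduced here; the sketch simply takes the resulting (domain-specific word, domain-shared word) pairs as input.

```python
from collections import defaultdict
from math import log2


def mutual_information(docs, labels, word):
    """Mutual information (in bits) between the presence of `word` and the label."""
    n = len(docs)
    counts = defaultdict(int)
    for tokens, label in zip(docs, labels):
        counts[(word in tokens, label)] += 1
    mi = 0.0
    for (present, label), c in counts.items():
        p_xy = c / n
        p_x = sum(v for (p, _), v in counts.items() if p == present) / n
        p_y = sum(v for (_, l), v in counts.items() if l == label) / n
        mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi


def select_shared_words(src_docs, src_labels, tgt_docs, threshold):
    """Assumption: a word is domain-shared if it occurs in both domains and
    its mutual information with the source labels reaches `threshold`."""
    common = set().union(*src_docs) & set().union(*tgt_docs)
    return {w for w in common
            if mutual_information(src_docs, src_labels, w) >= threshold}


def align_specific_words(src_pairs, tgt_pairs, shared):
    """Map source-specific words to target-specific words that are paired
    with the same domain-shared (bridge) word.

    `src_pairs` and `tgt_pairs` are iterables of
    (domain_specific_word, domain_shared_word) tuples."""
    by_bridge = defaultdict(set)
    for specific, bridge in tgt_pairs:
        if bridge in shared:
            by_bridge[bridge].add(specific)
    mapping = defaultdict(set)
    for specific, bridge in src_pairs:
        if bridge in shared:
            mapping[specific] |= by_bridge.get(bridge, set())
    # Drop source words that found no counterpart in the target domain.
    return {word: targets for word, targets in mapping.items() if targets}
```

In this sketch, a source-domain word such as "gripping" and a target-domain word such as "durable" end up mapped to each other only because both are paired with the same shared word (for example "good"), which mirrors the indirect mapping idea in the abstract.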

Key words: sentiment classification, cross-domain, polarity divergence, association rules, unified feature representation space, domain space alignment
