ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2015, Vol. 52, Issue (3): 629-638. doi: 10.7544/issn1000-1239.2015.20140156


Cross-Domain Text Sentiment Classification Based on Grouping-AdaBoost Ensemble

Zhao Chuanjun1, Wang Suge1,2, Li Deyu1,2, Li Xin1   

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006
  2. Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006

  Online: 2015-03-01

Abstract: In cross-domain sentiment classification, labeled data in the target domain is often scarce and valuable. To address this problem, this paper proposes a grouping-AdaBoost ensemble classifier method that combines semi-supervised learning, Bootstrapping, data grouping, AdaBoost, and ensemble learning. First, a small amount of labeled data in the target domain is used to generate virtual samples with the synthetic minority over-sampling technique. On this basis, the Bootstrapping method is used to obtain more target-domain data with high-confidence labels. To construct the classifier, the labeled data in the source domain is partitioned into groups of equal size, and each group is combined with the labeled data in the target domain to form a combined data set. A classifier is trained on each combined data set and then boosted with the AdaBoost method. Finally, the classifiers for the combined data sets are linearly integrated into an ensemble classifier. Experimental results on four data sets from Amazon online shopping review corpora indicate that the proposed method effectively improves the accuracy of cross-domain sentiment classification.
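The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the decision-stump base learner, the uniform weights in the linear combination, the labels in {-1, +1}, and all function names (`smote`, `partition`, `adaboost`, `grouping_adaboost`, `ensemble_predict`) are assumptions made for illustration, and the Bootstrapping (self-labeling) step is omitted.

```python
import math
import random

def smote(samples, n_new, k=1, rng=None):
    """Create n_new synthetic points of one class by interpolating between
    a sample and one of its k nearest same-class neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(samples)
        others = [s for s in samples if s is not x]
        neighbours = sorted(others, key=lambda s: math.dist(x, s))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

def partition(data, n_groups):
    """Split data into n_groups parts of (nearly) equal size."""
    return [data[i::n_groups] for i in range(n_groups)]

def stump_predict(stump, x):
    _, feature, threshold, sign = stump
    return sign if x[feature] > threshold else -sign

def best_stump(data, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(data[0][0])):
        for threshold in sorted({x[feature] for x, _ in data}):
            for sign in (1, -1):
                cand = (0.0, feature, threshold, sign)
                err = sum(w for (x, y), w in zip(data, weights)
                          if stump_predict(cand, x) != y)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def adaboost(data, rounds=5):
    """Standard discrete AdaBoost over decision stumps (assumed base learner)."""
    weights = [1.0 / len(data)] * len(data)
    model = []
    for _ in range(rounds):
        stump = best_stump(data, weights)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for (x, y), w in zip(data, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def boosted_score(model, x):
    return sum(alpha * stump_predict(stump, x) for alpha, stump in model)

def grouping_adaboost(source, target, n_groups=2, rounds=5):
    """Train one boosted classifier per (source group + target data) set."""
    return [adaboost(part + target, rounds)
            for part in partition(source, n_groups)]

def ensemble_predict(models, x):
    """Linear combination of the group classifiers (uniform weights assumed)."""
    score = sum(boosted_score(m, x) for m in models) / len(models)
    return 1 if score >= 0 else -1
```

In this sketch, `smote` would be called once per class on the labeled target-domain feature vectors, and the synthetic points appended to the target data with that class label before calling `grouping_adaboost`. How the paper weights the group classifiers in the final linear combination is not stated in the abstract, so the uniform average here is only a placeholder.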

Key words: sentiment classification, cross-domain, synthetic minority over-sampling technique, grouping-AdaBoost, ensemble classifier
