Abstract:
Sentiment classification of documents aims to determine the opinion (e.g., negative or positive) expressed in a given document. Existing studies have shown that supervised classification approaches usually perform well on this task. However, in most cases the available labeled data and the unlabeled data do not belong to the same domain, and the performance of a sentiment classifier decreases sharply when it is transferred from one domain to another. This gives rise to the problem of cross-domain sentiment classification, which is important and is attracting increasing attention. We propose a unified framework that integrates several stages for cross-domain sentiment classification. First, we use the accurate labels of source-domain documents to obtain initial labels for target-domain documents. Then, we model the target domain as a weighted network and select the target-domain documents whose opinions are determined most accurately as “source components” and “sink components”. Finally, with the help of these “source components” and “sink components”, we apply a heat conduction process to the weighted network to improve the classification of the target-domain data. We conduct an experiment using data from three different domains and transfer between two of them. The experimental results indicate that the proposed framework can substantially improve the performance of cross-domain sentiment classification.
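To make the last two stages concrete, the following is a minimal sketch of one way a heat conduction step over a weighted document network could look. It is an illustration under stated assumptions, not the formulation used in this work: the function name `heat_conduction`, the parameters `n_fixed` and `n_iter`, the choice of +1 for “source components” and -1 for “sink components”, and the neighbor-averaging update are all hypothetical, and the initial scores are assumed to come from a classifier trained on the source domain.

```python
import numpy as np


def heat_conduction(weights, initial_scores, n_fixed=1, n_iter=200):
    """Refine target-domain sentiment scores on a weighted network.

    weights:        symmetric (n, n) similarity matrix between documents
                    (assumed; the abstract does not specify the weighting).
    initial_scores: initial sentiment scores in [-1, 1], assumed to come
                    from a source-domain classifier.
    n_fixed:        hypothetical number of most confident documents to use
                    as sources (+1) and as sinks (-1).
    """
    w = np.asarray(weights, dtype=float).copy()
    np.fill_diagonal(w, 0.0)  # ignore self-similarity
    scores = np.asarray(initial_scores, dtype=float)

    # Assumption: the most confident positives act as heat sources and
    # the most confident negatives act as heat sinks.
    order = np.argsort(scores)
    sinks = order[:n_fixed]
    sources = order[-n_fixed:]

    # Row-normalize so each update is a weighted average over neighbors.
    row_sums = w.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # isolated documents receive no heat
    transition = w / row_sums

    heat = scores.copy()
    heat[sources] = 1.0
    heat[sinks] = -1.0
    for _ in range(n_iter):
        heat = transition @ heat
        # Sources and sinks are held at fixed temperatures.
        heat[sources] = 1.0
        heat[sinks] = -1.0

    labels = np.where(heat >= 0, 1, -1)
    return labels, heat


# Toy network: documents 0 and 1 are similar, as are documents 2 and 3.
toy_weights = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
toy_scores = np.array([0.9, -0.1, 0.1, -0.9])
toy_labels, toy_heat = heat_conduction(toy_weights, toy_scores)
```

In this toy case, documents 1 and 2 start with weak scores of the opposite sign to their closest neighbors; after the iteration each takes the sign of the confident document it is most strongly connected to.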