Abstract:
Multimodal sentiment analysis aims to utilize the multimodal customer comments and other data to identify users' sentimental tendencies. To realize cross-domain application with the domain bias, commonly used solutions are unsupervised domain adaptation methods. Nevertheless, this type of solutions focuses on the extraction of domain-invariant features, and it neglects the significance of domain-specific features at the target domain. Thus, a meta-optimization based domain-invariant and domain-specific feature disentanglement network is proposed. First, by embedding adapters into the pre-trained large model with fine-tuning fitting, the image-text fused sentiment feature encoder is accordingly constructed. Then, a feature disentanglement module is constructed on the basis of the factorization operation, which utilizes domain adversary and domain classification, together with collaborative independence constraints, respectively, to achieve knowledge-transferable domain-invariant feature embedding while extracting the domain-specific features to enhance the performance of sentiment classification at the target domain. To ensure the consistency of the overall optimization tendency for feature disentanglement and sentiment classification, a meta-learning-based meta-optimization training strategy is put forward to synergistically optimize the sentiment analysis network. Comparative experiments on bidirectional sentiment transfer tasks constructed by MVSA and Yelp datasets demonstrate that compared to other advanced image-text sentiment transfer algorithms, the proposed algorithm achieves superior performance on bidirectional sentiment transfer tasks in terms of three consensus metrics: Precision, Recall and
F1 score.