Abstract:
Because the combination of image and text better reflects users’ attitudes and standpoints, image-text sentiment analysis has become a research hotspot. However, existing sentiment analysis methods cannot extract and fuse image-text emotion information effectively, which results in low performance, a large number of parameters, and difficulty in deployment. In this paper, a lightweight image-text sentiment analysis model using public emotion feature compression and fusion is proposed. The model builds image and text feature compression modules that combine convolutional and fully connected layers, so that features are extracted and their dimension is reduced at the same time. In addition, a public emotion feature fusion module based on a gating mechanism is proposed. It eliminates the heterogeneity of image-text features by mapping the image and text features into the same emotional space, and it reduces redundant information by extracting and fusing the public emotion features of image and text. Experimental results on three benchmark datasets, Twitter, Flickr, and Getty Images, show that the proposed model extracts and fuses the emotional information of image-text more effectively than earlier models. Compared with the latest models, the proposed model greatly reduces the number of parameters while achieving better performance, and is easier to deploy.
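The gating idea described above can be illustrated with a minimal sketch. It is not the authors' module: it assumes a generic gated fusion in which both modalities are projected into a shared space of the same size and a sigmoid gate mixes them element-wise. All names (`gated_fusion`, `init_linear`, `SHARED_DIM`), the dimensions, and the random weights are hypothetical, and the convolution-based compression step is omitted.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Dense layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def init_linear(rng, n_out, n_in):
    """Hypothetical initialisation: small uniform weights, zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def gated_fusion(img_feat, txt_feat, params):
    """Assumed generic gated fusion, not the paper's exact formulation."""
    # Map both modalities into one shared space of equal dimension.
    img_proj = [math.tanh(x) for x in linear(*params["img"], img_feat)]
    txt_proj = [math.tanh(x) for x in linear(*params["txt"], txt_feat)]
    # The gate is computed from the concatenated projections.
    gate = [sigmoid(x) for x in linear(*params["gate"], img_proj + txt_proj)]
    # Element-wise convex combination of the two projected features.
    return [g * i + (1.0 - g) * t
            for g, i, t in zip(gate, img_proj, txt_proj)]


rng = random.Random(0)
IMG_DIM, TXT_DIM, SHARED_DIM = 8, 6, 4  # illustrative sizes only
params = {
    "img": init_linear(rng, SHARED_DIM, IMG_DIM),
    "txt": init_linear(rng, SHARED_DIM, TXT_DIM),
    "gate": init_linear(rng, SHARED_DIM, 2 * SHARED_DIM),
}
img_feat = [rng.random() for _ in range(IMG_DIM)]
txt_feat = [rng.random() for _ in range(TXT_DIM)]
fused = gated_fusion(img_feat, txt_feat, params)
```

In this sketch the fused vector has the size of the shared space rather than the sum of the two input sizes, which is one way a fusion step can keep the downstream classifier small.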