Citation: Gan Chenquan, Fu Xiang, Feng Qingdong, Zhu Qingyi. A Lightweight Image-Text Sentiment Analysis Model Based on Public Emotion Feature Compression and Fusion[J]. Journal of Computer Research and Development, 2023, 60(5): 1099-1110. DOI: 10.7544/issn1000-1239.202111218
Because the combination of images and text better reflects users’ attitudes and standpoints, image-text sentiment analysis has become a research hotspot. However, existing sentiment analysis methods cannot extract and fuse image-text emotion information effectively, which leads to low performance, a large number of parameters, and difficulty in deployment. This paper proposes a lightweight image-text sentiment analysis model based on public emotion feature compression and fusion. The model builds image and text feature compression modules that combine convolutional and fully connected layers, so that features are extracted and their dimension is reduced at the same time. In addition, a public emotion feature fusion module based on a gating mechanism is proposed. It eliminates the heterogeneity of image and text features by mapping them into the same emotional space, and it reduces redundant information by extracting and fusing the public emotion features of the image and text. Experimental results on three benchmark datasets (Twitter, Flickr, and Getty Images) show that the proposed model extracts and fuses the emotional information of images and text more effectively than earlier models. Compared with the latest models, the proposed model uses far fewer parameters, performs better, and is easier to deploy.
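The abstract gives no formulas for the fusion module, so the following is only a minimal sketch of a generic gated fusion step, not the authors' implementation. It assumes that each modality is first projected into a shared space of equal dimension, and that a sigmoid gate then mixes the two projected vectors per dimension. All names (`gated_fusion`, `init_layer`, `params`), the feature sizes, and the random weights are hypothetical and chosen for illustration.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer y = W x + b, with W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(rng, out_dim, in_dim):
    """Small random weights and zero bias (illustrative, untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def gated_fusion(image_feat, text_feat, params):
    # Assumption: both modalities are mapped into one shared space.
    img = [math.tanh(v) for v in linear(image_feat, *params["img_proj"])]
    txt = [math.tanh(v) for v in linear(text_feat, *params["txt_proj"])]
    # Gate values in (0, 1), computed from both projected vectors.
    gate = [sigmoid(v) for v in linear(img + txt, *params["gate"])]
    # Per-dimension convex combination of the two modalities.
    fused = [g * i + (1.0 - g) * t for g, i, t in zip(gate, img, txt)]
    return fused, gate


rng = random.Random(0)
shared_dim = 4  # hypothetical size of the shared space
params = {
    "img_proj": init_layer(rng, shared_dim, 8),
    "txt_proj": init_layer(rng, shared_dim, 6),
    "gate": init_layer(rng, shared_dim, 2 * shared_dim),
}
image_feat = [rng.uniform(-1, 1) for _ in range(8)]  # stand-in image feature
text_feat = [rng.uniform(-1, 1) for _ in range(6)]   # stand-in text feature
fused, gate = gated_fusion(image_feat, text_feat, params)
print(len(fused), all(0.0 < g < 1.0 for g in gate))
```

In this sketch the gate decides, per dimension, how much of the image projection versus the text projection reaches the fused vector; a trained model would learn the projection and gate weights rather than draw them at random.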