Citation: Zhang Yuhong, Zhi Wenwu, Li Peipei, Hu Xuegang. Semi-Supervised Method for Cross-Lingual Word Embedding Based on an Adversarial Model with Double Discriminators[J]. Journal of Computer Research and Development, 2023, 60(9): 2127-2136. DOI: 10.7544/issn1000-1239.202220036
Cross-lingual word embedding aims to use the embedding space of a resource-rich language to improve the embedding of a resource-scarce language, and it is widely used in a variety of cross-lingual tasks. Most existing methods address word alignment by learning a linear mapping between the two embedding spaces. Among them, methods based on adversarial models have received widespread attention because they achieve good performance without using any dictionary. However, these methods do not perform well on dissimilar language pairs. The reason may be that the mapping is learned only from a distance measure over the entire space, without the guidance of a seed dictionary; this leaves multiple possibilities for the aligned word pairs and leads to unsatisfactory alignment. Therefore, in this paper, a semi-supervised cross-lingual word embedding method based on an adversarial model with double discriminators is proposed. On top of the existing adversarial model, a bi-directional, shared, fine-grained discriminator is added, yielding an adversarial model with double discriminators. In addition, a negative-sample dictionary is introduced as a supplement to the supervised seed dictionary to guide the fine-grained discriminator in a semi-supervised way. By minimizing the distance between the initial word pairs and the supervised dictionaries (the seed dictionary and the negative dictionary), the fine-grained discriminator reduces the ambiguity among candidate word pairs and recognizes the correctly aligned pairs in the initially generated dictionaries. Finally, experimental results on two cross-lingual datasets show that the proposed method effectively improves the performance of cross-lingual word embedding.
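The abstract's core setting is mapping-based alignment: a linear map is learned from the source embedding space to the target space, and dictionary pairs supervise it. The paper's contribution is the adversarial double-discriminator training, but the supervised component it builds on can be illustrated by the classic orthogonal Procrustes step used in standard mapping-based approaches. The sketch below is a minimal numpy illustration on synthetic data, not the paper's implementation; all dimensions and variable names are illustrative.

```python
import numpy as np

# Toy illustration of the supervised mapping step in mapping-based
# cross-lingual embedding: given seed-dictionary pairs (x_i, y_i),
# learn an orthogonal map W minimizing ||XW - Y||_F. The closed-form
# solution is W = U V^T where U S V^T = SVD(X^T Y).
rng = np.random.default_rng(0)

d, n_seed = 5, 20
X = rng.normal(size=(n_seed, d))                    # source-side seed embeddings
R_true, _ = np.linalg.qr(rng.normal(size=(d, d)))   # hidden "true" rotation
Y = X @ R_true                                      # target-side seed embeddings

# Orthogonal Procrustes solution
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# W is orthogonal, so it preserves monolingual distances while aligning
# the seed pairs; in adversarial methods this map is instead trained so
# that a discriminator cannot tell mapped source vectors from target ones.
assert np.allclose(W.T @ W, np.eye(d))
assert np.allclose(X @ W, Y)
```

With an exact rotation and more seed pairs than dimensions, the Procrustes solution recovers the map exactly; with noisy or partially wrong dictionaries (the situation the paper's fine-grained discriminator targets), it gives the least-squares-optimal orthogonal map over whatever pairs it is given.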