Zhang Naizhou, Cao Wei, Zhang Xiaojian, Li Shijun. Conversation Generation Based on Variational Attention Knowledge Selection and Pre-trained Language Model[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440551

Conversation Generation Based on Variational Attention Knowledge Selection and Pre-trained Language Model

Funds: This work was supported by the National Natural Science Foundation of China (31700858) and the Key Technologies Research and Development Program of Henan Province (242102210076).
More Information
  • Author Bio:

    Zhang Naizhou: born in 1970. PhD, associate professor. Member of CCF. His main research interests include Web search, data mining, and machine learning

    Cao Wei: born in 1977. Master, associate professor. Her main research interests include machine learning and Web data mining

    Zhang Xiaojian: born in 1980. PhD, professor, master supervisor. His main research interests include differential privacy, privacy analysis in language models, and graph data management

    Li Shijun: born in 1964. PhD, professor, PhD supervisor. His main research interests include database theory, Web search engines, and Web mining

  • Received Date: June 19, 2024
  • Revised Date: October 21, 2024
  • Accepted Date: November 11, 2024
  • Available Online: November 14, 2024
  • Research on knowledge-grounded dialogue often suffers from the problem that external knowledge contains redundant, or even noisy, information irrelevant to the conversation topic, which degrades the performance of the dialogue system. Knowledge selection has therefore become an important approach to this issue. However, existing work has not yet investigated in depth several questions it raises, such as how to design a knowledge selector, how to exploit the selected knowledge, and which scenarios suit knowledge-selection-based conversation methods. In this paper, we propose a new neural conversation method based on conditional variational attention knowledge selection and a pre-trained language model. The method employs a knowledge selection algorithm based on a conditional variational autoencoder (CVAE) and a multi-layer attention mechanism to pick the textual knowledge most relevant to the current conversation, effectively exploiting the dialogue responses in the training data to improve the efficiency of knowledge selection. Our model adopts the pre-trained language model BART as its encoder-decoder architecture and incorporates the selected textual knowledge into BART to fine-tune it during training. Experimental results show that, compared with current representative dialogue models, the proposed model generates more diverse and coherent dialogue responses with higher accuracy.
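  • The CVAE-style selection step the abstract describes — a prior distribution over knowledge sentences conditioned on the dialogue context, a posterior that additionally sees the gold response, and a KL term that pulls the prior toward the response-informed posterior — can be sketched as below. This is a minimal illustration under assumed vector shapes and hypothetical names (bilinear attention weights `W_prior`, `W_post`, random stand-ins for the encoders), not the authors' implementation, which uses multi-layer attention and BART encodings.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def softmax(x):
        e = np.exp(x - x.max(-1, keepdims=True))
        return e / e.sum(-1, keepdims=True)

    def attention_scores(query, keys, W):
        # bilinear attention: score_i = k_i^T (W q)
        return keys @ (W @ query)

    d, n_knowledge = 8, 5
    context = rng.normal(size=d)                    # encoded dialogue context (stand-in)
    response = rng.normal(size=d)                   # encoded gold response, available only in training
    knowledge = rng.normal(size=(n_knowledge, d))   # encoded candidate knowledge sentences

    W_prior = 0.1 * rng.normal(size=(d, d))
    W_post = 0.1 * rng.normal(size=(d, d))

    # prior p(k | c): attention over knowledge conditioned on the context alone
    prior = softmax(attention_scores(context, knowledge, W_prior))
    # posterior q(k | c, r): also conditioned on the gold response
    post = softmax(attention_scores(context + response, knowledge, W_post))

    # KL(q || p) is added to the training loss, so the prior learns to mimic
    # the response-informed posterior and can be used alone at inference time
    kl = float(np.sum(post * (np.log(post) - np.log(prior))))

    # training: select via the posterior; inference: select via the prior
    selected = int(np.argmax(post))
    ```

    The selected sentence would then be concatenated with the dialogue context and fed to the BART encoder during fine-tuning; the KL term is what lets the prior, which never sees the response, still benefit from it at training time.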

  • [1]
    Ni Jinjie, Young T, Pandelea V, et al. Recent advances in deep learning based dialogue systems: A systematic survey[J]. Artificial Intelligence Review, 2023, 56(4): 3055−3155 doi: 10.1007/s10462-022-10248-8
    [2]
    Serban I V, Sordoni A, Bengio Y, et al. Building end-to-end dialogue systems using generative hierarchical neural network models [C] //Proc of the 30th AAAI Conf on Artificial Intelligence. Palo Alto, CA: AAAI, 2016: 3776−3784
    [3]
    Serban I V, Sordoni A, Lowe R, et al. A hierarchical latent variable encoder-decoder model for generating dialogues [C] //Proc of the 31st AAAI Conf on Artificial Intelligence. Palo Alto, CA: AAAI, 2017: 3295–3301
    [4]
    Li Jiwei, Galley M, Brocket C, et al. A diversity-promoting objective function for neural conversation models [C] //Proc of the 2016 Conf of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: ACL, 2016: 110−119
    [5]
    Zhao Tiancheng, Zhao Ran, Eskénazi M. Learning discourse-level diversity for neural dialog models using conditional variational autoencoders [C] //Proc of the 55th Conf of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2017: 654−664
    [6]
    Zhang Rongjunchen, Wu Tingmin, Chen Xiao, et al. A transformer-based dialogue system with dynamic attention [C] //Proc of the Web Conf 2023. New York: ACM, 2023: 1604−1615
    [7]
    Golovanov S, Kurbanov R, Sergey I N, et al. Large-scale transfer learning for natural language generation [C] //Proc of the 57th Conf of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 6053−6058
    [8]
    Ghazvininejad M, Brockett C, Chang Mingwei, et al. A knowledge-grounded neural conversation model [C] //Proc of the 32nd AAAI Conf on Artificial Intelligence. Palo Alto, CA: AAAI, 2018: 5110−5117
    [9]
    Li Zekang, Niu Cheng, Meng Fandong, et al. Incremental transformer with deliberation decoder for document grounded conversations [C] //Proc of the 57th Conf of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 12−21
    [10]
    Liu Shilei, Zhao Xiaofeng, Li Bochao, et al. A three-stage learning framework for low-resource knowledge-grounded dialogue generation [C] //Proc of the 2021 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2021: 2262−2272
    [11]
    Dinan E, Roller S, Shuster K, et al. Wizard of wikipedia: Knowledge-powered conversational agents [C/OL] //Proc of the 7th Int Conf on Learning Representations. Ithaca, NY: Cornell University, 2019[2024-09-25]. https://openreview.net/forum?id=HtLWEQ69sf9
    [12]
    Lian Rongzhong, Xie Min, Wang Fan, et al. Learning to select knowledge for response generation in dialog systems [C] //Proc of the 28th Int Joint Conf on Artificial Intelligence. San Francisco, CA: Morgan Kaufmann, 2019: 5081−5087
    [13]
    Kim B, Ahn J, Kim G. Sequential latent knowledge selection for knowledge-grounded dialogue [C/OL] //Proc of the 8th Int Conf on Learning Representations. Ithaca, NY: Cornell University, 2020[2024-09-25]. https://openreview.net/forum?id=pm6GlJQp4Oo
    [14]
    Chen Xiuyi, Meng Fandong, Li Peng, et al. Bridging the gap between prior and posterior knowledge selection for knowledge-grounded dialogue generation [C] //Proc of the 2020 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2020: 3426−3437
    [15]
    Chen Xiuyi, Chen Feilong, Meng Fandong, et al. Unsupervised knowledge selection for dialogue generation [C] //Proc of the Joint Conf of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th Int Joint Conf on Natural Language Processing: Findings. Stroudsburg, PA: ACL, 2021: 1230−1244
    [16]
    Liu Zhibin, Niu Zhengyu, Wu Hua, et al. Knowledge aware conversation generation with explainable reasoning over augmented graphs [C] //Proc of the 2019 Conf on Empirical Methods in Natural Language Processing and the 9th Int Joint Conf on Natural Language Processing. Stroudsburg, PA: ACL, 2019: 1782−1792
    [17]
    Meng Chuan, Ren Pengjie, Chen Zhumin, et al. Initiative-aware self-supervised learning for knowledge-grounded conversations [C] //Proc of the 44th Int ACM SIGIR Conf on Research and Development in Information Retrieval. New York: ACM, 2021: 522−532
    [18]
    Zhao Xueliang, Wu Wei, Xu Can, et al. Knowledge-grounded dialogue generation with pre-trained language models [C] //Proc of the 2020 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2020: 115−125
    [19]
    Zheng Chujie, Cao Yunbo, Jiang Daxin, et al. Difference-aware knowledge selection for knowledge-grounded conversation generation [C] //Proc of the 2020 Conf on Empirical Methods in Natural Language Processing: Findings. Stroudsburg, PA: ACL, 2020: 115−125
    [20]
    Sun Weiwei, Ren Pengjie, Ren Zhaochun. Generative knowledge selection for knowledge-grounded dialogues [C] //Proc of the 17th Conf of the European Chapter of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2023: 2032−2043
    [21]
    孙润鑫,马龙轩,张伟男,等. 基于文档的对话研究[J]. 计算机研究与发展,2021,58(9):1915−1924 doi: 10.7544/issn1000-1239.2021.20200634

    Sun Runxin, Ma Longxuan, Zhang Weinan, et al. Research on document grounded conversations[J]. Journal of Computer Research and Development, 2021, 58(9): 1915−1924 (in Chinese) doi: 10.7544/issn1000-1239.2021.20200634
    [22]
    Yang Yizhe, Huang Heyan, Liu Yuhang, et al. Graph vs. sequence: An empirical study on knowledge forms for knowledge-grounded dialogue [C] //Proc of the 2023 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2023: 15846−15858
    [23]
    Sukhbaatar S, Szlam A, Weston J, et al. End-to-end memory networks [C] //Proc of the 28th Conf on Neural Information Processing Systems. Cambridge, MA: MIT Press, 2015: 2440−2448
    [24]
    Lewis M, Liu Yinhan, Goyal N, et al. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension [C] //Proc of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 7871−7880
    [25]
    Zhou Kangyan, Prabhumoye S, Black A W. A dataset for document grounded conversations [C] //Proc of the 2018 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2018: 708−713
    [26]
    Papineni K, Roukos S, Ward T, et al. Bleu: A method for automatic evaluation of machine translation [C] //Proc of the 40th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2002: 311−318
    [27]
    Liu Pengfei, Yuan Weizhe, Fu Jinlan, et al. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing [J]. ACM Computing Surveys, 2023, 55(9): 195: 1−195: 35
    [28]
    Zhang Tianyi, Kishore V, Wu F, et al. BERTScore: Evaluating text generation with BERT [C/OL] //Proc of the 8th Int Conf on Learning Representations. Ithaca, NY: Cornell University, 2020[2024-09-25]. https://openreview.net/forum?id=SkeHuCVFDr
    [29]
    Liu Chiawei, Lowe R, Serban I, et al. How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation [C] //Proc of the 2016 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2016: 2122−2132
