Zhang Naizhou, Cao Wei, Zhang Xiaojian, Li Shijun. Conversation Generation Based on Variational Attention Knowledge Selection and Pre-trained Language Model[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440551

Conversation Generation Based on Variational Attention Knowledge Selection and Pre-trained Language Model

Funds: This work was supported by the National Natural Science Foundation of China (31700858) and the Key Technologies Research and Development Program of Henan Province (242102210076).
More Information
  • Author Bio:

    Zhang Naizhou: born in 1970. PhD, associate professor. Member of CCF. His main research interests include Web search, data mining, and machine learning

    Cao Wei: born in 1977. Master, associate professor. Her main research interests include machine learning and Web data mining

    Zhang Xiaojian: born in 1980. PhD, professor, master supervisor. His main research interests include differential privacy, privacy analysis in language models, and graph data management

    Li Shijun: born in 1964. PhD, professor, PhD supervisor. His main research interests include database theory, Web search engines, and Web mining

  • Received Date: June 19, 2024
  • Revised Date: October 21, 2024
  • Accepted Date: November 11, 2024
  • Available Online: November 14, 2024
  • Research on knowledge-grounded dialogue often suffers from the problem that external knowledge contains redundant, or even noisy, information irrelevant to the conversation topic, which degrades the performance of the dialogue system. Knowledge selection has therefore become an important approach to this issue. However, existing work has not yet investigated in depth several questions it raises, such as how to design a knowledge selector, how to exploit the selected knowledge, and which scenarios suit knowledge-selection-based conversation methods. In this paper, we propose a new neural conversation method based on conditional variational attention knowledge selection and a pre-trained language model. The method employs a knowledge selection algorithm based on a conditional variational autoencoder (CVAE) and a multi-layer attention mechanism to pick the textual knowledge most relevant to the current conversation, effectively exploiting the dialogue responses in the training data to improve the efficiency of knowledge selection. Our model adopts the pre-trained language model BART as its encoder-decoder architecture and incorporates the selected textual knowledge into BART to fine-tune it during training. Experimental results show that, compared with current representative dialogue models, the proposed model generates more diverse and coherent dialogue responses with higher accuracy.
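  • The CVAE-style selection step the abstract describes — a prior distribution over knowledge sentences conditioned on the dialogue context, a posterior that additionally sees the gold response, and a KL term that pulls the prior toward the response-informed posterior — can be sketched as below. This is a minimal illustration under assumed vector shapes and hypothetical names (bilinear attention weights `W_prior`, `W_post`, random stand-ins for the encoders), not the authors' implementation, which uses multi-layer attention and BART encodings.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def softmax(x):
        e = np.exp(x - x.max(-1, keepdims=True))
        return e / e.sum(-1, keepdims=True)

    def attention_scores(query, keys, W):
        # bilinear attention: score_i = k_i^T (W q)
        return keys @ (W @ query)

    d, n_knowledge = 8, 5
    context = rng.normal(size=d)                    # encoded dialogue context (stand-in)
    response = rng.normal(size=d)                   # encoded gold response, available only in training
    knowledge = rng.normal(size=(n_knowledge, d))   # encoded candidate knowledge sentences

    W_prior = 0.1 * rng.normal(size=(d, d))
    W_post = 0.1 * rng.normal(size=(d, d))

    # prior p(k | c): attention over knowledge conditioned on the context alone
    prior = softmax(attention_scores(context, knowledge, W_prior))
    # posterior q(k | c, r): also conditioned on the gold response
    post = softmax(attention_scores(context + response, knowledge, W_post))

    # KL(q || p) is added to the training loss, so the prior learns to mimic
    # the response-informed posterior and can be used alone at inference time
    kl = float(np.sum(post * (np.log(post) - np.log(prior))))

    # training: select via the posterior; inference: select via the prior
    selected = int(np.argmax(post))
    ```

    The selected sentence would then be concatenated with the dialogue context and fed to the BART encoder during fine-tuning; the KL term is what lets the prior, which never sees the response, still benefit from it at training time.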

  • [1]
    Ni Jinjie, Young T, Pandelea V, et al. Recent advances in deep learning based dialogue systems: A systematic survey[J]. Artificial Intelligence Review, 2023, 56(4): 3055−3155 doi: 10.1007/s10462-022-10248-8
    [2]
    Serban I V, Sordoni A, Bengio Y, et al. Building end-to-end dialogue systems using generative hierarchical neural network models [C] //Proc of the 30th AAAI Conf on Artificial Intelligence. Palo Alto, CA: AAAI, 2016: 3776−3784
    [3]
    Serban I V, Sordoni A, Lowe R, et al. A hierarchical latent variable encoder-decoder model for generating dialogues [C] //Proc of the 31st AAAI Conf on Artificial Intelligence. Palo Alto, CA: AAAI, 2017: 3295–3301
    [4]
    Li Jiwei, Galley M, Brocket C, et al. A diversity-promoting objective function for neural conversation models [C] //Proc of the 2016 Conf of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: ACL, 2016: 110−119
    [5]
    Zhao Tiancheng, Zhao Ran, Eskénazi M. Learning discourse-level diversity for neural dialog models using conditional variational autoencoders [C] //Proc of the 55th Conf of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2017: 654−664
    [6]
    Zhang Rongjunchen, Wu Tingmin, Chen Xiao, et al. A transformer-based dialogue system with dynamic attention [C] //Proc of the Web Conf 2023. New York: ACM, 2023: 1604−1615
    [7]
    Golovanov S, Kurbanov R, Sergey I N, et al. Large-scale transfer learning for natural language generation [C] //Proc of the 57th Conf of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 6053−6058
    [8]
    Ghazvininejad M, Brockett C, Chang Mingwei, et al. A knowledge-grounded neural conversation model [C] //Proc of the 32nd AAAI Conf on Artificial Intelligence. Palo Alto, CA: AAAI, 2018: 5110−5117
    [9]
    Li Zekang, Niu Cheng, Meng Fandong, et al. Incremental transformer with deliberation decoder for document grounded conversations [C] //Proc of the 57th Conf of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 12−21
    [10]
    Liu Shilei, Zhao Xiaofeng, Li Bochao, et al. A three-stage learning framework for low-resource knowledge-grounded dialogue generation [C] //Proc of the 2021 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2021: 2262−2272
    [11]
    Dinan E, Roller S, Shuster K, et al. Wizard of wikipedia: Knowledge-powered conversational agents [C/OL] //Proc of the 7th Int Conf on Learning Representations. Ithaca, NY: Cornell University, 2019[2024-09-25]. https://openreview.net/forum?id=HtLWEQ69sf9
    [12]
    Lian Rongzhong, Xie Min, Wang Fan, et al. Learning to select knowledge for response generation in dialog systems [C] //Proc of the 28th Int Joint Conf on Artificial Intelligence. San Francisco, CA: Morgan Kaufmann, 2019: 5081−5087
    [13]
    Kim B, Ahn J, Kim G. Sequential latent knowledge selection for knowledge-grounded dialogue [C/OL] //Proc of the 8th Int Conf on Learning Representations. Ithaca, NY: Cornell University, 2020[2024-09-25]. https://openreview.net/forum?id=pm6GlJQp4Oo
    [14]
    Chen Xiuyi, Meng Fandong, Li Peng, et al. Bridging the gap between prior and posterior knowledge selection for knowledge-grounded dialogue generation [C] //Proc of the 2020 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2020: 3426−3437
    [15]
    Chen Xiuyi, Chen Feilong, Meng Fandong, et al. Unsupervised knowledge selection for dialogue generation [C] //Proc of the Joint Conf of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th Int Joint Conf on Natural Language Processing: Findings. Stroudsburg, PA: ACL, 2021: 1230−1244
    [16]
    Liu Zhibin, Niu Zhengyu, Wu Hua, et al. Knowledge aware conversation generation with explainable reasoning over augmented graphs [C] //Proc of the 2019 Conf on Empirical Methods in Natural Language Processing and the 9th Int Joint Conf on Natural Language Processing. Stroudsburg, PA: ACL, 2019: 1782−1792
    [17]
    Meng Chuan, Ren Pengjie, Chen Zhumin, et al. Initiative-aware self-supervised learning for knowledge-grounded conversations [C] //Proc of the 44th Int ACM SIGIR Conf on Research and Development in Information Retrieval. New York: ACM, 2021: 522−532
    [18]
    Zhao Xueliang, Wu Wei, Xu Can, et al. Knowledge-grounded dialogue generation with pre-trained language models [C] //Proc of the 2020 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2020: 115−125
    [19]
    Zheng Chujie, Cao Yunbo, Jiang Daxin, et al. Difference-aware knowledge selection for knowledge-grounded conversation generation [C] //Proc of the 2020 Conf on Empirical Methods in Natural Language Processing: Findings. Stroudsburg, PA: ACL, 2020: 115−125
    [20]
    Sun Weiwei, Ren Pengjie, Ren Zhaochun. Generative knowledge selection for knowledge-grounded dialogues [C] //Proc of the 17th Conf of the European Chapter of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2023: 2032−2043
    [21]
    孙润鑫,马龙轩,张伟男,等. 基于文档的对话研究[J]. 计算机研究与发展,2021,58(9):1915−1924 doi: 10.7544/issn1000-1239.2021.20200634

    Sun Runxin, Ma Longxuan, Zhang Weinan, et al. Research on document grounded conversations[J]. Journal of Computer Research and Development, 2021, 58(9): 1915−1924 (in Chinese) doi: 10.7544/issn1000-1239.2021.20200634
    [22]
    Yang Yizhe, Huang Heyan, Liu Yuhang, et al. Graph vs. sequence: An empirical study on knowledge forms for knowledge-grounded dialogue [C] //Proc of the 2023 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2023: 15846−15858
    [23]
    Sukhbaatar S, Szlam A, Weston J, et al. End-to-end memory networks [C] //Proc of the 28th Conf on Neural Information Processing Systems. Cambridge, MA: MIT Press, 2015: 2440−2448
    [24]
    Lewis M, Liu Yinhan, Goyal N, et al. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension [C] //Proc of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 7871−7880
    [25]
    Zhou Kangyan, Prabhumoye S, Black A W. A dataset for document grounded conversations [C] //Proc of the 2018 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2018: 708−713
    [26]
    Papineni K, Roukos S, Ward T, et al. Bleu: A method for automatic evaluation of machine translation [C] //Proc of the 40th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2002: 311−318
    [27]
    Liu Pengfei, Yuan Weizhe, Fu Jinlan, et al. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing [J]. ACM Computing Surveys, 2023, 55(9): 195: 1−195: 35
    [28]
    Zhang Tianyi, Kishore V, Wu F, et al. BERTScore: Evaluating text generation with BERT [C/OL] //Proc of the 8th Int Conf on Learning Representations. Ithaca, NY: Cornell University, 2020[2024-09-25]. https://openreview.net/forum?id=SkeHuCVFDr
    [29]
    Liu Chiawei, Lowe R, Serban I, et al. How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation [C] //Proc of the 2016 Conf on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2016: 2122−2132
