Citation: Ma Xinyu, Fan Yixing, Guo Jiafeng, Zhang Ruqing, Su Lixin, Cheng Xueqi. An Empirical Investigation of Generalization and Transfer in Short Text Matching[J]. Journal of Computer Research and Development, 2022, 59(1): 118-126. DOI: 10.7544/issn1000-1239.20200626

An Empirical Investigation of Generalization and Transfer in Short Text Matching

Funds: This work was supported by the National Natural Science Foundation of China (61722211, 61773362, 61872338, 62006218, 61902381), the National Key Research and Development Program of China (2016QY02D0405), the Project of Beijing Academy of Artificial Intelligence (BAAI2019ZD0306), the Youth Innovation Promotion Association CAS (20144310, 2016102), the Project of Chongqing Research Program of Basic Research and Frontier Technology (cstc2017jcyjBX0059), the K.C.Wong Education Foundation, and the Lenovo-CAS Joint Lab Youth Scientist Project.
  • Published Date: December 31, 2021
  • Many natural language understanding tasks, such as natural language inference, question answering, and paraphrase identification, can be viewed as short text matching problems. Recently, the emergence of many datasets and deep learning models has driven great progress in short text matching. However, few studies have analyzed how well these datasets generalize across different text matching tasks, or how supervised datasets from multiple domains can be leveraged in a new domain to reduce annotation cost and improve performance. In this paper, we conduct an extensive investigation of generalization and transfer across different datasets and use visualization to show the factors that affect generalization. Specifically, we experiment with a conventional neural semantic matching model, ESIM (enhanced sequential inference model), and a pre-trained language model, BERT (bidirectional encoder representations from transformers), over 10 common datasets. We show that even BERT, which is pre-trained on a large-scale dataset, can still improve its performance on the target dataset through transfer learning. Following our analysis, we also demonstrate that pre-training on multiple datasets yields good generalization and transfer. In a new domain with few-shot data, BERT that is first pre-trained on multiple datasets and then transferred to the new dataset achieves promising performance.
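
The two-stage recipe described in the abstract (train on several source datasets, then transfer to a small target dataset) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' setup: a toy logistic-regression matcher stands in for ESIM and BERT, the hand-made `features` function and the tiny `nli`, `qa`, and `target` lists are hypothetical, and pooling the source datasets in stage 1 is only one possible way to pre-train on multiple datasets.

```python
import math

def features(a: str, b: str) -> list[float]:
    """Toy pair features: bias, Jaccard token overlap, relative length gap."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    jaccard = len(ta & tb) / len(union) if union else 0.0
    gap = abs(len(ta) - len(tb)) / max(len(ta), len(tb), 1)
    return [1.0, jaccard, gap]

def predict(w: list[float], a: str, b: str) -> float:
    """Probability that the two texts match under weights w."""
    z = sum(wi * xi for wi, xi in zip(w, features(a, b)))
    return 1.0 / (1.0 + math.exp(-z))

def train(w, pairs, epochs=50, lr=0.5):
    """SGD on the logistic loss, starting from w; returns a new weight list."""
    w = list(w)
    for _ in range(epochs):
        for a, b, y in pairs:
            x = features(a, b)
            err = predict(w, a, b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def pretrain_then_transfer(sources, target, **kw):
    """Stage 1: train on the pooled source datasets (an assumed pooling scheme).
    Stage 2: continue training from those weights on the target dataset."""
    pooled = [ex for dataset in sources for ex in dataset]
    w = train([0.0, 0.0, 0.0], pooled, **kw)
    return train(w, target, **kw)

# Hypothetical toy datasets: (text_a, text_b, label), label 1 = match.
nli = [("a man is sleeping", "a man is sleeping now", 1),
       ("a dog runs", "the stock market fell", 0)]
qa = [("what is the capital of france", "the capital of france is paris", 1),
      ("who wrote hamlet", "bananas are yellow fruit", 0)]
target = [("how do i reset my password", "how can i reset my password", 1),
          ("how do i reset my password", "best pizza in town", 0)]

w = pretrain_then_transfer([nli, qa], target)
```

With a real model, stage 1 and stage 2 would each be a fine-tuning run that starts from the previous checkpoint; the sketch only shows how the weights are carried from the source stage into the target stage.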
