Citation: Bi Fenglin, Zhang Qiming, Zhang Jiarui, Wang Yantong, Chen Yang, Zhang Yanbin, Wang Wei, Zhou Xuan. A Retrieval-Augmented Generation System Based on a Sliding Window Strategy in Large Language Models[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440411

A Retrieval-Augmented Generation System Based on a Sliding Window Strategy in Large Language Models

Funds: This work was supported by the General Program of the National Natural Science Foundation of China (62137001, 62277017, 61977026).
  • Author Bio:

    Bi Fenglin: born in 1993. PhD candidate, student member of CCF. His main research interests include large language models, open source software supply chains and ecology, and software bots

    Zhang Qiming: born in 1999. Master candidate, student member of CCF. His main research interests include large language model applications and computational pedagogy

    Zhang Jiarui: born in 2000. Master candidate. His main research interests include large language models and computational pedagogy

    Wang Yantong: born in 2000. Master candidate. His main research interests include large language models and open source software ecology

    Chen Yang: born in 1981. PhD, professor, PhD supervisor, senior member of ACM, IEEE, and CCF. His main research interests include computer networks, social computing, and big data analytics

    Zhang Yanbin: born in 1983. Master, engineer, master supervisor, member of CCF. Her main research interests include computational pedagogy and software engineering

    Wang Wei: born in 1979. PhD, professor, PhD supervisor, member of CCF. His main research interests include open source measurement and digital ecosystems

    Zhou Xuan: born in 1979. PhD, professor, PhD supervisor, member of CCF. His main research interests include high-performance databases and information retrieval

  • Received Date: May 30, 2024
  • Revised Date: February 20, 2025
  • Accepted Date: March 02, 2025
  • Available Online: March 02, 2025
  • Leveraging a sliding window strategy, this study presents a retrieval-augmented generation (RAG) system designed to enhance the factual accuracy and reliability of outputs from large language models (LLMs). By applying a sliding window mechanism during the indexing phase, the system addresses the limitations of fixed context window sizes and static retrieval methods. Three sliding window strategies are proposed for efficiently processing and segmenting texts: Fixed Window Size and Fixed Step Length Split (FFS), Dynamic Window Size and Fixed Step Length Split (DFS), and Dynamic Window Size and Dynamic Step Length Split (DDS). To further improve retrieval accuracy and relevance, the system employs several advanced query techniques, including query expansion and query reformulation. Rigorous experimental evaluations were conducted with the state-of-the-art Llama-3 model on multiple diverse datasets, encompassing both general knowledge and domain-specific corpora. The results demonstrate optimal performance with a block size of 1 024 tokens and a step size of 3, significantly improving F1 scores across various tasks. This configuration highlights the importance of balancing document segment length against sliding window step size to maximize information retention and retrieval efficacy. The sliding window strategy effectively preserves contextual information, reduces information loss, and adapts well across different datasets and query types.
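    To make the three segmentation strategies concrete, the following Python sketch illustrates one possible reading of FFS, DFS, and DDS. It is a minimal illustration only: the paper's actual window and step schedules, the unit of the step length (tokens, sentences, or blocks), and the dynamic sizing rules are not specified in the abstract, so the window_fn and step_fn callbacks below are assumptions introduced for illustration.

    from typing import Callable, Iterator, Sequence

    def ffs_split(tokens: Sequence[str], window: int, step: int) -> Iterator[Sequence[str]]:
        # Fixed Window Size and Fixed Step Length Split (FFS):
        # constant-length chunks, constant offset between chunk starts.
        for start in range(0, len(tokens), step):
            yield tokens[start:start + window]
            if start + window >= len(tokens):
                break

    def dfs_split(tokens: Sequence[str], window_fn: Callable[[int], int],
                  step: int) -> Iterator[Sequence[str]]:
        # Dynamic Window Size and Fixed Step Length Split (DFS):
        # the window length is recomputed at each position (e.g., to end
        # chunks on sentence boundaries), while the step stays constant.
        start = 0
        while start < len(tokens):
            window = max(window_fn(start), 1)
            yield tokens[start:start + window]
            if start + window >= len(tokens):
                break
            start += step

    def dds_split(tokens: Sequence[str], window_fn: Callable[[int], int],
                  step_fn: Callable[[int], int]) -> Iterator[Sequence[str]]:
        # Dynamic Window Size and Dynamic Step Length Split (DDS):
        # both the window and the step adapt to the local text.
        start = 0
        while start < len(tokens):
            window = max(window_fn(start), 1)
            yield tokens[start:start + window]
            if start + window >= len(tokens):
                break
            start += max(step_fn(start), 1)

    if __name__ == "__main__":
        # Toy demonstration with word-level "tokens".
        words = "the quick brown fox jumps over the lazy dog".split()
        print([" ".join(chunk) for chunk in ffs_split(words, window=4, step=2)])

    Overlap between consecutive chunks grows as the step shrinks relative to the window, so a small step (such as the step size of 3 reported above) trades a larger index for better preservation of context across chunk boundaries.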

  • [1]
    Zhao W X, Zhou Kun, Li Junyi, et al. A survey of large language models[J]. arXiv preprint, arXiv: 2303.18223, 2023
    [2]
    Taffa A T, Usbeck R. Leveraging LLMs in scholarly knowledge graph question answering[J]. arXiv preprint, arXiv: 2311.09841, 2023
    [3]
    Wang Chaojie, Xu Yishi, Peng Zhong, et al. Keqing: Knowledge-based question answering is a nature chain-of-thought mentor of LLM[J]. arXiv preprint, arXiv: 2401.00426, 2023
    [4]
    Rawte V, Sheth A, Das A. A survey of hallucination in large foundation models[J]. arXiv preprint, arXiv: 2309.05922, 2023
    [5]
    Zhang Yue, Li Yafu, Cui Leyang, et al. Siren’s song in the AI ocean: A survey on hallucination in large language models[J]. arXiv preprint, arXiv: 2309.01219, 2023
    [6]
    Zhu Zhiying, Sun Zhiqing, Yang Yiming. HaluEval-Wild: Evaluating hallucinations of language models in the wild[J]. arXiv preprint, arXiv: 2403.04307, 2024
    [7]
    Lewis P, Perez E, Piktus A, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks[J]. Advances in Neural Information Processing Systems, 2020, 33: 9459−9474
    [8]
    Finardi P, Avila L, Castaldoni R, et al. The chronicles of RAG: The retriever, the chunk and the generator[J]. arXiv preprint, arXiv: 2401.07883, 2024
    [9]
    田永林,王兴霞,王雨桐,等. RAG-PHI:检索增强生成驱动的平行人与平行智能[J]. 智能科学与技术学报,2024,6(1):41−51

    Tian Yonglin, Wang Xingxia, Wang Yutong, et al. RAG-PHI: Retrieval-enhanced generation-driven parallel humans and parallel intelligence[J]. Journal of Intelligent Science and Technology, 2024, 6(1): 41−51 (in Chinese)
    [10]
    Gao Yunfan, Xiong Yun, Gao Xinyu, et al. Retrieval-augmented generation for large language models: A survey[J]. arXiv preprint, arXiv: 2312.10997, 2023
    [11]
    田萱,吴志超. 基于信息检索的知识库问答综述[J/OL]. 计算机研究与发展. [2025-02-17]. http://kns.cnki.net/kcms/detail/11.1777.TP.20240805.1345.004.html

    Tian Xuan, Wu Zhichao. Review of knowledge base question answering based on information retrieval[J/OL]. Journal of Computer Research and Development. [2025-02-17]. http://kns.cnki.net/kcms/detail/11.1777.TP.20240805.1345.004.html (in Chinese)
    [12]
    Anantha R, Bethi T, Vodianik D, et al. Context tuning for retrieval augmented generation[J]. arXiv preprint, arXiv: 2312.05708, 2023
    [13]
    Ortiz L J, Olaya A G, Borrajo D. A dynamic sliding window approach for activity recognition[C]//Proc of the 19th Int Conf on user modeling, adaption and personalization (UMAP 2011). Berlin: Springer, 2011: 219−230
    [14]
    Packer C, Fang V, Patil S G, et al. MemGPT: Towards LLMs as operating systems[J]. arXiv preprint, arXiv: 2310.08560, 2023
    [15]
    Xu Peng, Ping Wei, Wu Xianchao, et al. Retrieval meets long context large language models[J]. arXiv preprint, arXiv: 2310.03025, 2023
    [16]
    韩炳涛,刘涛. 大模型关键技术与应用[J]. 中兴通讯技术,2024,30(2):76−88 doi: 10.12142/ZTETJ.202402012

    Han Bingtao, Liu Tao. Key technologies and applications of large models[J]. ZTE Communications, 2024, 30(2): 76−88 (in Chinese) doi: 10.12142/ZTETJ.202402012
    [17]
    Chen H, Xu Fangyuan, Arora S, et al. Understanding retrieval augmentation for long-form question answering[J]. arXiv preprint, arXiv: 2310.12150, 2023
    [18]
    Chor B, Kushilevitz E, Goldreich O, et al. Private information retrieval[J]. Journal of the ACM, 1998, 45(6): 965−981 doi: 10.1145/293347.293350
    [19]
    Luo Ziyang, Xu Can, Zhao Pu, et al. Augmented large language models with parametric knowledge guiding[J]. arXiv preprint, arXiv: 2305.04757, 2023
    [20]
    He Xiaoxin, Tian Yijun, Sun Yifei, et al. G-Retriever: Retrieval-augmented generation for textual graph understanding and question answering[J]. arXiv preprint, arXiv: 2402.07630, 2024
    [21]
    Teja R. Evaluating the ideal chunk size for a RAG system using LlamaIndex[EB/OL]. [2024-04-01]. https://www.llamaindex.ai/blog/evaluating-the-ideal-chunk-size-for-a-rag-system-using-llamaindex-6207e5d3fec5
    [22]
    LangChain. Recursively split by character[EB/OL]. [2024-05-01]. https://python.langchain.com/docs/modules/dataconnection/documenttransformers/recursivetextsplitter
    [23]
    Dhuliawala S, Komeili M, Xu Jing, et al. Chain-of-verification reduces hallucination in large language models[J]. arXiv preprint, arXiv: 2309.11495, 2023
    [24]
    Ma Xinbei, Gong Yeyun, He Pengcheng, et al. Query rewriting for retrieval-augmented large language models[J]. arXiv preprint, arXiv: 2305.14283, 2023
    [25]
    Peng Wenjun, Li Guiyang, Jiang Yue, et al. Large language model based long-tail query rewriting in Taobao search[J]. arXiv preprint, arXiv: 2311.03758, 2023
    [26]
    张嘉睿,张豈明,毕枫林,等. 基于IPEX-LLM的本地轻量化课程教学智能辅助系统[J]. 华东师范大学学报:自然科学版,2024(5):162−172

    Zhang Jiarui, Zhang Qiming, Bi Fenglin, et al. Locally lightweight course teaching-assistant system based on IPEX-LLM[J]. Journal of East China Normal University: Natural Science, 2024(5): 162−172 (in Chinese)
    [27]
    LangChain. MultiQueryRetriever documentation[EB/OL]. [2024-05-11]. https://python.langchain.com/v0.1/docs/modules/data_connection/retrievers/MultiQueryRetriever
    [28]
    Hugging Face. Open LLM leaderboard[EB/OL]. [2024-05-01]. https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
    [29]
    Bai Yushi, Lv Xin, Zhang Jiajie, et al. Longbench: A bilingual, multitask benchmark for long context understanding[J]. arXiv preprint, arXiv: 2308.14508, 2023
    [30]
    Shaham U, Segal E, Ivgi M, et al. Scrolls: Standardized comparison over long language sequences[J]. arXiv preprint, arXiv: 2201.03533, 2022
    [31]
    Kočisk`y T, Schwarz J, Blunsom P, et al. The narrativeqa reading comprehension challenge[J]. Transactions of the Association for Computational Linguistics, 2018, 6: 317−328 doi: 10.1162/tacl_a_00023
    [32]
    Dasigi P, Lo K, Beltagy I, et al. A dataset of information-seeking questions and answers anchored in research papers[J]. arXiv preprint, arXiv: 2105.03011, 2021
    [33]
    Yang Zhilin, Qi Peng, Zhang Saizheng, et al. HotpotQA: A dataset for diverse, explainable multi-hop question answering[J]. arXiv preprint, arXiv: 1809.09600, 2018
    [34]
    LangChain. RecursiveCharacterTextSplitter - LangChain API documentation[EB/OL]. [2024-05-01]. https://api.python.langchain.com/en/latest/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html
    [35]
    OpenAI. GPT−3.5 Turbo model documentation[EB/OL]. [2024-07-01]. https://platform.openai.com/docs/models/gpt-3-5-turbo
    [36]
    Fu Yao. Challenges in deploying long-context transformers: A theoretical peak performance analysis[J]. arXiv preprint, arXiv: 2405.08944, 2024
    [37]
    Lewis P. , Perez E. , Piktus A. , et al. Retrieval-augmented generation for knowledge-intensive NLP tasks[J]. Advances in Neural Information Processing Systems, 2020, 33: 9459-9474
    [38]
    Guu K, Lee K, Tung Z, et al. REALM: retrieval-augmented language model pre-training[J]. arXiv preprint, arXiv: 2002.08909, 2020
    [39]
    冯杨洋,汪庆,谢旻晖,等. 从BERT到ChatGPT:大模型训练中的存储系统挑战与技术发展[J]. 计算机研究与发展,2024,61(4):809−823 doi: 10.7544/issn1000-1239.202330554

    Feng Yangyang, Wang Qing, Xie Wenhui, et al. From BERT to ChatGPT: Challenges and technological developments in storage systems for large model training[J]. Journal of Computer Research and Development, 2024, 61(4): 809−823 (in Chinese) doi: 10.7544/issn1000-1239.202330554