Citation: Bi Fenglin, Zhang Qiming, Zhang Jiarui, Wang Yantong, Chen Yang, Zhang Yanbin, Wang Wei, Zhou Xuan. A Retrieval-Augmented Generation System Based on a Sliding Window Strategy in Large Language Models[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440411
This study presents a retrieval-augmented generation system built on a sliding window strategy, aimed at improving the factual accuracy and reliability of large language model (LLM) outputs. Applying a sliding window mechanism during the indexing phase addresses the limitations of fixed context window sizes and static retrieval methods. Three sliding window strategies are proposed for segmenting text efficiently: Fixed Window Size and Fixed Step Length Split (FFS), Dynamic Window Size and Fixed Step Length Split (DFS), and Dynamic Window Size and Dynamic Step Length Split (DDS). To further improve retrieval accuracy and relevance, the system employs advanced query techniques, including query expansion and reformulation. Experimental evaluations were conducted with the Llama-3 model on multiple datasets covering both general knowledge and domain-specific corpora. Results show the best performance with a block size of 1 024 tokens and a step size of 3, yielding significant F1 improvements across tasks, and highlight the importance of balancing document segment length against sliding window step size to maximize information retention and retrieval efficacy. The sliding window strategy preserves contextual information, reduces information loss, and adapts well across datasets and query types.
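The abstract names the three splitters but does not define them precisely, so the sketch below is a minimal illustration rather than the authors' implementation: the sentence-level windows, the character count standing in for a token budget, and the DFS/DDS growth and step rules are all assumptions, and `simple_sentences`, `ffs_split`, `dfs_split`, and `dds_split` are hypothetical names introduced here.

```python
from typing import Callable, List


def simple_sentences(text: str) -> List[str]:
    """Very naive sentence splitter; a real system would use an NLP tokenizer."""
    return [s.strip() + "." for s in text.split(".") if s.strip()]


def ffs_split(sents: List[str], window: int = 8, step: int = 3) -> List[str]:
    """FFS: every chunk covers `window` sentences and the window advances by a
    fixed `step`, so consecutive chunks overlap and border context is kept."""
    return [" ".join(sents[i:i + window]) for i in range(0, len(sents), step)]


def dfs_split(sents: List[str], budget: int = 1024, step: int = 3,
              size: Callable[[str], int] = len) -> List[str]:
    """DFS (assumed rule): the window grows sentence by sentence until a size
    budget is filled (`size=len` counts characters as a stand-in for tokens),
    while the step between window starts stays fixed."""
    chunks, i = [], 0
    while i < len(sents):
        j, used = i, 0
        while j < len(sents) and used + size(sents[j]) <= budget:
            used += size(sents[j])
            j += 1
        chunks.append(" ".join(sents[i:max(j, i + 1)]))  # keep at least one sentence
        i += step
    return chunks


def dds_split(sents: List[str], budget: int = 1024, overlap: float = 0.5,
              size: Callable[[str], int] = len) -> List[str]:
    """DDS (assumed rule): the window grows as in DFS, and the step is then
    derived from the realised window (advance past (1 - overlap) of it), so
    both window size and step length adapt to the text."""
    chunks, i = [], 0
    while i < len(sents):
        j, used = i, 0
        while j < len(sents) and used + size(sents[j]) <= budget:
            used += size(sents[j])
            j += 1
        j = max(j, i + 1)
        chunks.append(" ".join(sents[i:j]))
        i += max(1, int((j - i) * (1.0 - overlap)))  # dynamic step
    return chunks


if __name__ == "__main__":
    doc = "RAG systems retrieve evidence before generating. " * 20
    sents = simple_sentences(doc)
    print(len(ffs_split(sents, window=4, step=3)), "FFS chunks")
    print(len(dds_split(sents, budget=200)), "DDS chunks")
```

Under these assumptions, FFS exposes exactly the two knobs the abstract tunes (block size versus step size), while DFS and DDS let chunk boundaries adapt to sentence length, which is one plausible way the strategies would exhibit the adaptability the abstract reports.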
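The abstract also mentions query expansion and reformulation. Below is a similarly hedged sketch of one common realisation, generating paraphrased query variants with an LLM and merging the retrieved chunks; `ask_llm` and `retrieve` are placeholder interfaces rather than a real API, and the prompt wording is illustrative only.

```python
from typing import Callable, List


def expand_query(question: str, ask_llm: Callable[[str], str],
                 n_variants: int = 3) -> List[str]:
    """Ask the model for paraphrases of the question and keep the original."""
    prompt = (f"Rewrite the following question in {n_variants} different ways, "
              f"one per line, preserving its meaning:\n{question}")
    variants = [q.strip() for q in ask_llm(prompt).splitlines() if q.strip()]
    return [question] + variants[:n_variants]


def retrieve_union(queries: List[str],
                   retrieve: Callable[[str, int], List[str]],
                   k: int = 4) -> List[str]:
    """Run every query variant against the retriever and deduplicate results,
    so the generator sees the union of evidence the variants surface."""
    seen, merged = set(), []
    for q in queries:
        for chunk in retrieve(q, k):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged
```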