Citation: Bi Fenglin, Zhang Qiming, Zhang Jiarui, Wang Yantong, Chen Yang, Zhang Yanbin, Wang Wei, Zhou Xuan. A Retrieval-Augmented Generation System Based on a Sliding Window Strategy in Large Language Models[J]. Journal of Computer Research and Development. DOI: 10.7544/issn1000-1239.202440411
This study presents a retrieval-augmented generation system built on a sliding window strategy, aimed at improving the factual accuracy and reliability of large language model (LLM) outputs. Applying a sliding window mechanism during the indexing phase addresses the limitations of fixed context window sizes and static retrieval methods. Three sliding window strategies are proposed for segmenting text efficiently: Fixed Window Size and Fixed Step Length Split (FFS), Dynamic Window Size and Fixed Step Length Split (DFS), and Dynamic Window Size and Dynamic Step Length Split (DDS). To further improve retrieval accuracy and relevance, the system employs advanced query techniques, including query expansion and reformulation. Experimental evaluations were conducted with the Llama-3 model on multiple datasets covering both general knowledge and domain-specific corpora. Results show the best performance with a block size of 1 024 tokens and a step size of 3, yielding significant F1 improvements across tasks, and highlight the importance of balancing document segment length against sliding window step size to maximize information retention and retrieval efficacy. The sliding window strategy preserves contextual information, reduces information loss, and adapts well across datasets and query types.
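The abstract names the three splitters but does not define them precisely, so the sketch below is a minimal illustration rather than the authors' implementation: the sentence-level windows, the character count standing in for a token budget, and the DFS/DDS growth and step rules are all assumptions, and `simple_sentences`, `ffs_split`, `dfs_split`, and `dds_split` are hypothetical names introduced here.

```python
from typing import Callable, List


def simple_sentences(text: str) -> List[str]:
    """Very naive sentence splitter; a real system would use an NLP tokenizer."""
    return [s.strip() + "." for s in text.split(".") if s.strip()]


def ffs_split(sents: List[str], window: int = 8, step: int = 3) -> List[str]:
    """FFS: every chunk covers `window` sentences and the window advances by a
    fixed `step`, so consecutive chunks overlap and border context is kept."""
    return [" ".join(sents[i:i + window]) for i in range(0, len(sents), step)]


def dfs_split(sents: List[str], budget: int = 1024, step: int = 3,
              size: Callable[[str], int] = len) -> List[str]:
    """DFS (assumed rule): the window grows sentence by sentence until a size
    budget is filled (`size=len` counts characters as a stand-in for tokens),
    while the step between window starts stays fixed."""
    chunks, i = [], 0
    while i < len(sents):
        j, used = i, 0
        while j < len(sents) and used + size(sents[j]) <= budget:
            used += size(sents[j])
            j += 1
        chunks.append(" ".join(sents[i:max(j, i + 1)]))  # keep at least one sentence
        i += step
    return chunks


def dds_split(sents: List[str], budget: int = 1024, overlap: float = 0.5,
              size: Callable[[str], int] = len) -> List[str]:
    """DDS (assumed rule): the window grows as in DFS, and the step is then
    derived from the realised window (advance past (1 - overlap) of it), so
    both window size and step length adapt to the text."""
    chunks, i = [], 0
    while i < len(sents):
        j, used = i, 0
        while j < len(sents) and used + size(sents[j]) <= budget:
            used += size(sents[j])
            j += 1
        j = max(j, i + 1)
        chunks.append(" ".join(sents[i:j]))
        i += max(1, int((j - i) * (1.0 - overlap)))  # dynamic step
    return chunks


if __name__ == "__main__":
    doc = "RAG systems retrieve evidence before generating. " * 20
    sents = simple_sentences(doc)
    print(len(ffs_split(sents, window=4, step=3)), "FFS chunks")
    print(len(dds_split(sents, budget=200)), "DDS chunks")
```

Under these assumptions, FFS exposes exactly the two knobs the abstract tunes (block size versus step size), while DFS and DDS let chunk boundaries adapt to sentence length, which is one plausible way the strategies would exhibit the adaptability the abstract reports.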
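The abstract also mentions query expansion and reformulation. Below is a similarly hedged sketch of one common realisation, generating paraphrased query variants with an LLM and merging the retrieved chunks; `ask_llm` and `retrieve` are placeholder interfaces rather than a real API, and the prompt wording is illustrative only.

```python
from typing import Callable, List


def expand_query(question: str, ask_llm: Callable[[str], str],
                 n_variants: int = 3) -> List[str]:
    """Ask the model for paraphrases of the question and keep the original."""
    prompt = (f"Rewrite the following question in {n_variants} different ways, "
              f"one per line, preserving its meaning:\n{question}")
    variants = [q.strip() for q in ask_llm(prompt).splitlines() if q.strip()]
    return [question] + variants[:n_variants]


def retrieve_union(queries: List[str],
                   retrieve: Callable[[str, int], List[str]],
                   k: int = 4) -> List[str]:
    """Run every query variant against the retriever and deduplicate results,
    so the generator sees the union of evidence the variants surface."""
    seen, merged = set(), []
    for q in queries:
        for chunk in retrieve(q, k):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged
```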