Citation: Wang Zirui, Jiang Dejun. Key Techniques of Swapping Mechanism Based on Ultra-Low Latency SSD[J]. Journal of Computer Research and Development, 2024, 61(3): 557-570. DOI: 10.7544/issn1000-1239.202330538
With the rapid growth of memory-intensive applications, memory capacity has become an increasingly important application requirement. However, limits on DRAM density constrain how far DRAM capacity can scale. Swapping is a common memory-expansion technique that temporarily moves less-used memory pages to a storage device. In the past, slow disk reads and writes were the main obstacle to its wide adoption. In recent years, the rapid development of ultra-low latency SSDs has allowed swapping to exploit low-latency reads and writes and become more efficient. With device latency this low, however, the software overhead of the swap I/O stack becomes significant. We analyze and evaluate the Linux swapping mechanism on ultra-low latency SSDs and design Ultraswap, a swapping mechanism based on such devices. Ultraswap adds handling of polled requests to the Linux I/O stack and reduces I/O merging and scheduling overhead to obtain a lightweight I/O stack. Building on this I/O stack, Ultraswap further optimizes the kernel's swap-in and swap-out paths: by optimizing page-fault handling and direct memory reclaim, it reduces the time spent on the critical path of swapping. The results show that Ultraswap improves average performance by 19% compared with the Linux swapping mechanism; with 20% local memory, it achieves a 33% performance improvement, effectively reducing the time overhead on the critical path of swapping.