Citation: Wang Zirui, Jiang Dejun. Key Techniques of Swapping Mechanism Based on Ultra-Low Latency SSD[J]. Journal of Computer Research and Development, 2024, 61(3): 557-570. DOI: 10.7544/issn1000-1239.202330538
As memory-intensive applications proliferate, applications demand ever more memory capacity. However, the limited density of DRAM chips constrains how far DRAM capacity can scale. Swapping is a common memory-expansion technique that temporarily moves less-used memory pages to a storage device. In the past, slow disk reads and writes were the main obstacle to wide adoption of swapping. In recent years, ultra-low latency SSDs have developed rapidly, and swapping can exploit their low-latency reads and writes to become more efficient. With device latency this low, however, the software overhead of the swap I/O stack becomes significant. We analyze and evaluate the Linux swapping mechanism on ultra-low latency SSDs and design Ultraswap, a swapping mechanism built for such devices. Ultraswap adds handling of polling requests to the Linux I/O stack and reduces I/O merging and scheduling overhead, yielding a lightweight I/O stack. On top of this I/O stack, Ultraswap further optimizes the swap-in and swap-out paths of the kernel swapping mechanism: by optimizing page-fault handling and direct memory reclaim, it reduces the time spent on the critical path of swapping. The results show that Ultraswap improves average performance by 19% over the Linux swapping mechanism; with 20% local memory, Ultraswap achieves a 33% performance improvement.
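As a rough illustration of the polling idea mentioned in the abstract, the toy Python sketch below contrasts an interrupt-style wait (sleep until woken) with a polled wait (spin on a completion flag). It is not Ultraswap's implementation and does not use any Linux kernel API: `ToyDevice`, `wait_interrupt`, `wait_polling`, and the 1 ms delay are hypothetical names and values chosen only for illustration.

```python
import threading
import time


class ToyDevice:
    """Hypothetical stand-in for an SSD: completes a request after a delay."""

    def __init__(self, delay_s):
        self.delay_s = delay_s

    def submit(self, page_id):
        # A request carries both a flag (for polling) and an event (for wake-up).
        req = {"page": page_id, "done": False, "event": threading.Event()}

        def complete():
            time.sleep(self.delay_s)   # simulated device service time
            req["done"] = True         # completion becomes visible to a poller
            req["event"].set()         # "interrupt": wake a sleeping waiter

        threading.Thread(target=complete, daemon=True).start()
        return req


def wait_interrupt(req):
    """Interrupt style: block until the completion event wakes the caller."""
    req["event"].wait()
    return req["page"]


def wait_polling(req):
    """Polling style: spin on the completion flag instead of sleeping."""
    while not req["done"]:
        pass
    return req["page"]


if __name__ == "__main__":
    dev = ToyDevice(delay_s=0.001)  # illustrative value, not a measurement
    print(wait_interrupt(dev.submit(7)))
    print(wait_polling(dev.submit(9)))
```

Polling skips the sleep and wake-up step at the cost of keeping a CPU busy, so it tends to pay off only when the device completes requests quickly.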