Pan Fengfeng, Xiong Jin. NV-Shuffle: Shuffle Based on Non-Volatile Memory[J]. Journal of Computer Research and Development, 2018, 55(2): 229-245. DOI: 10.7544/issn1000-1239.2018.20170742

NV-Shuffle: Shuffle Based on Non-Volatile Memory

  • Published Date: January 31, 2018
  • In popular big data processing platforms such as Spark, data is commonly collected in a many-to-many fashion during a stage traditionally known as the Shuffle phase, through which data is exchanged across different types of tasks or stages. During this phase, the data must be transferred over the network and persisted to a traditional disk-based file system, so the efficiency of the Shuffle phase is one of the key factors in the performance of big data processing. To reduce I/O overhead, we propose NV-Shuffle, an optimized Shuffle strategy based on non-volatile memory (NVM). Next-generation NVM technologies, such as Phase Change Memory (PCM) and Spin-Transfer Torque Magnetic Memories (STTMs), introduce new opportunities for reducing I/O overhead thanks to their non-volatility, high read/write performance, and low energy consumption. In in-memory big data processing platforms such as Spark, disk-based access to Shuffle data is an important factor in application performance. NV-Shuffle uses NVM as persistent memory to store Shuffle data and, instead of relying on a traditional file system, introduces NV-Buffer to organize the data so that it can be accessed directly, like memory. We implemented NV-Shuffle in Spark. Our performance results show that NV-Shuffle reduces job execution time by 10% to 40% for Shuffle-heavy workloads.