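• Outstanding Science and Technology Journal of China
  • CCF-recommended Class A Chinese-language journal
  • Class T1 high-quality science and technology journal in the computing field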
Xu Danya, Wang Jing, Wang Li, Zhang Weigong. A Cross-Layer Memory Tracing Toolkit for Big Data Application Based on Spark[J]. Journal of Computer Research and Development, 2020, 57(6): 1179-1190. DOI: 10.7544/issn1000-1239.2020.20200109

A Cross-Layer Memory Tracing Toolkit for Big Data Application Based on Spark

Funds: This work was supported by the National Natural Science Foundation of China (61772350), the Beijing Nova Program (Z181100006218093), the Research Fund from Beijing Innovation Center for Future Chips (KYJJ2018008), the Construction Plan of Beijing High-level Teacher Team (CIT&TCD201704082), and the Capacity Building for Sci-Tech Innovation Fundamental Scientific Research Funds (19530050173).
  • Published Date: May 31, 2020
  • Spark has been increasingly adopted in industry for big data analytics because of its efficient in-memory distributed programming model. Most existing optimization and analysis tools for Spark operate at either the application layer or the operating system layer in isolation, which separates Spark semantics from the underlying system behavior. For example, without knowing how operating system parameters affect performance at the Spark layer, users cannot tell how to tune those parameters to improve system performance. In this paper, we propose SMTT, a new Spark memory tracing toolkit that links the semantics of the upper-layer application to the underlying physical hardware across the Spark, JVM, and OS layers. Based on the characteristics of Spark memory, we design separate tracing schemes for execution memory and storage memory. We then use SMTT to analyze Spark's iterative computation process and its execution/storage memory usage. An experiment on RDD memory assessment analysis shows that the toolkit can be used effectively for performance analysis and can guide optimization of the Spark memory system.
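
For readers unfamiliar with the two memory regions named in the abstract, the following is a minimal sketch of how Spark's unified memory manager sizes them from the JVM heap. It is not part of SMTT and does not reproduce the paper's tracing scheme. The function name `unified_memory_regions` is hypothetical; the defaults mirror the documented Spark 2.x+ settings `spark.memory.fraction` (0.6) and `spark.memory.storageFraction` (0.5) and the 300 MiB of reserved heap. The heap size passed in is an assumed value; a real JVM reports slightly less than `-Xmx` as its maximum memory.

```python
# Illustrative sketch only: sizes the execution and storage regions of
# Spark's unified memory model from a given heap size. Not SMTT code.

RESERVED_BYTES = 300 * 1024 * 1024  # heap that Spark sets aside for itself


def unified_memory_regions(heap_bytes, memory_fraction=0.6, storage_fraction=0.5):
    """Return the initial sizes (in bytes) of the unified, storage, and
    execution regions. The parameters correspond to spark.memory.fraction
    and spark.memory.storageFraction; the function name is hypothetical."""
    usable = heap_bytes - RESERVED_BYTES
    if usable <= 0:
        raise ValueError("heap must be larger than the reserved memory")
    unified = int(usable * memory_fraction)
    # The storage share is a soft boundary: execution may borrow free
    # storage memory, and storage may borrow free execution memory.
    storage = int(unified * storage_fraction)
    execution = unified - storage
    return {"unified": unified, "storage": storage, "execution": execution}


if __name__ == "__main__":
    regions = unified_memory_regions(4 * 1024**3)  # assumed 4 GiB heap
    for name, size in regions.items():
        print(f"{name}: {size / 1024**2:.1f} MiB")
```

Because the boundary between the two regions moves at run time, these figures are only starting points; observing how usage actually shifts during a job is the kind of question a tracing toolkit is meant to answer.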