Pu Yonglin, Yu Jiong, Lu Liang, Li Ziyang, Guo Binglei, Liao Bin. Energy-Efficient Strategy Based on Data Recovery in Storm[J]. Journal of Computer Research and Development, 2021, 58(3): 479-496. DOI: 10.7544/issn1000-1239.2021.20200489
1(School of Information Science and Engineering, Xinjiang University, Urumqi 830046)
2(School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300)
3(College of Statistics and Data Science, Xinjiang University of Finance and Economics, Urumqi 830012)
Funds: This work was supported by the National Natural Science Foundation of China (61862060, 61462079, 61562086, 61562078), the Research Innovation Project of Graduate Student in Xinjiang Uygur Autonomous Region (XJ2019G038), and the Doctoral Innovation Program of Xinjiang University (XJUBSCX-201902).
As one of the most popular platforms for big data stream computing, Storm was designed for high performance with little regard for energy consumption, and its high energy consumption now restricts the platform's development. To address this problem, we establish a task allocation model, a topology information monitoring model, a data recovery model, and an energy consumption model, and on this basis propose an energy-efficient strategy based on data recovery in Storm (DR-Storm). The strategy consists of a throughput detection algorithm and a data recovery algorithm. The throughput detection algorithm calculates cluster throughput from the topology information fed back by the topology information monitoring model and uses this feedback to decide whether tasks in the cluster topology should be terminated. The data recovery algorithm selects a backup node for data storage according to the data recovery model and uses feedback from the topology information monitoring model to decide whether the cluster topology is suitable for data recovery. DR-Storm then recovers the data within the cluster topology from the memory of the backup node. We evaluate DR-Storm by measuring cluster latency and energy consumption efficiency in a big data stream computing environment. The experimental results show that, compared with existing work, the proposed strategy reduces cluster latency and power while effectively saving energy.
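
The two decisions described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the throughput formula, the termination criterion, or the rule for choosing a backup node, so the names `NodeStats`, `cluster_throughput`, `should_terminate_tasks`, and `select_backup_node`, the "throughput below a threshold" test, and the "most free memory" selection rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeStats:
    """Hypothetical per-node monitoring record (not from the paper)."""
    name: str
    tuples_processed: int   # tuples handled during the monitoring window
    free_memory_mb: float   # memory available for holding backup data


def cluster_throughput(nodes: List[NodeStats], window_s: float) -> float:
    """Tuples per second across all nodes over one monitoring window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return sum(n.tuples_processed for n in nodes) / window_s


def should_terminate_tasks(throughput: float, threshold: float) -> bool:
    """Assumed criterion: flag tasks for termination when the measured
    throughput falls below a chosen threshold."""
    return throughput < threshold


def select_backup_node(nodes: List[NodeStats],
                       required_mb: float) -> Optional[NodeStats]:
    """Assumed rule: among nodes with enough free memory for the backup,
    pick the one with the most free memory; return None if none qualifies."""
    candidates = [n for n in nodes if n.free_memory_mb >= required_mb]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.free_memory_mb)


if __name__ == "__main__":
    nodes = [NodeStats("worker-a", 1000, 512.0),
             NodeStats("worker-b", 3000, 2048.0)]
    rate = cluster_throughput(nodes, window_s=10.0)
    print(rate, should_terminate_tasks(rate, threshold=500.0))
    print(select_backup_node(nodes, required_mb=1024.0))
```

In this sketch the monitoring feedback is reduced to a list of per-node records; in a real Storm cluster those values would have to come from the platform's own metrics, and the threshold and memory requirement would need to be tuned per topology.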