    Qi Kaiyuan, Han Yanbo, Zhao Zhuofeng, Fang Jun. MapReduce Intermediate Result Cache for Concurrent Data Stream Processing[J]. Journal of Computer Research and Development, 2013, 50(1): 111-121.

    MapReduce Intermediate Result Cache for Concurrent Data Stream Processing

    • With the growth of Internet of Things applications, real-time processing of sensor data streams over large-scale historical data poses a new challenge. The traditional MapReduce programming model is designed for batch processing of large-scale data and cannot satisfy real-time requirements. Extending MapReduce toward real-time processing through preprocessing, pipelining, and localization calls for an intermediate result cache for key-value data. Such a cache should avoid repeated remote I/O and recomputation by making full use of local memory and storage, localize stream processing by distributing data across the cluster, and support the frequent reads and writes of data stream processing. This paper proposes a scalable, extensible, and efficient key-value intermediate result cache that consists of Hash B-tree structures and SSTable files. To improve performance under high concurrency, the paper also devises a probability-based B-tree structure, together with a multiplexing search algorithm that exploits the B-tree balance property, and improves the file read-write strategy and the replacement algorithm by using overhead estimation and buffered information. Theoretical analysis and benchmark experiments show that the proposed structures and algorithms further improve the concurrency performance of MapReduce intermediate results, and that the intermediate result cache effectively supports data stream processing over large-scale data.
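
    The two-tier layout described in the abstract (an in-memory index backed by sorted files on local storage) can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `IntermediateResultCache`, the `num_buckets` and `max_in_memory` parameters, and the JSON file layout are all hypothetical, a sorted list per hash bucket stands in for the Hash B-tree, and a sorted JSON file stands in for an SSTable. The probability-based B-tree, the multiplexing search, and the replacement algorithm are not modeled.

```python
import bisect
import json
import os


class IntermediateResultCache:
    """Toy key-value cache: hash-partitioned sorted index in memory, sorted spill files on disk.

    Assumptions (not from the paper): keys are strings, values are JSON-serializable,
    and a flush is triggered by a simple entry-count threshold.
    """

    def __init__(self, directory, num_buckets=4, max_in_memory=4):
        self.directory = directory
        self.num_buckets = num_buckets
        self.max_in_memory = max_in_memory  # hypothetical flush threshold
        self.sstables = []                  # spill file paths, oldest first
        self._reset_memory()

    def _reset_memory(self):
        # One sorted key list per bucket (stand-in for a B-tree) plus a value map.
        self.keys = [[] for _ in range(self.num_buckets)]
        self.values = [{} for _ in range(self.num_buckets)]
        self.size = 0

    def _bucket(self, key):
        return hash(key) % self.num_buckets

    def put(self, key, value):
        b = self._bucket(key)
        if key not in self.values[b]:
            bisect.insort(self.keys[b], key)
            self.size += 1
        self.values[b][key] = value
        if self.size >= self.max_in_memory:
            self.flush()

    def flush(self):
        """Write all in-memory entries to one sorted, immutable file, then clear memory."""
        if self.size == 0:
            return
        merged = sorted(
            (k, self.values[b][k])
            for b in range(self.num_buckets)
            for k in self.keys[b]
        )
        path = os.path.join(self.directory, f"sstable_{len(self.sstables):04d}.json")
        with open(path, "w") as f:
            json.dump(merged, f)
        self.sstables.append(path)
        self._reset_memory()

    def get(self, key, default=None):
        b = self._bucket(key)
        if key in self.values[b]:
            return self.values[b][key]
        # Search the newest file first so later writes shadow earlier ones.
        for path in reversed(self.sstables):
            with open(path) as f:
                entries = json.load(f)
            file_keys = [k for k, _ in entries]
            i = bisect.bisect_left(file_keys, key)
            if i < len(file_keys) and file_keys[i] == key:
                return entries[i][1]
        return default
```

    In this sketch a read is served from local memory when possible and falls back to local files otherwise, which is the general idea behind avoiding repeated remote I/O. A real design would keep file indexes in memory instead of reloading each file on every lookup.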