Zhao Hui, Yang Shuqiang, Chen Zhikun, Yin Hong, and Jin Songchang. Optimization of Range Queries and Analysis for MapReduce Systems[J]. Journal of Computer Research and Development, 2014, 51(3): 606-617.

Optimization of Range Queries and Analysis for MapReduce Systems

  • Published Date: March 14, 2014
  • Recently, the MapReduce parallel computing paradigm has gained extensive attention from industry and academia. MapReduce works well at Google, Yahoo!, and Facebook for massive data processing. However, MapReduce-based systems were originally designed to process massive unstructured and semi-structured data, in workloads such as inverted indexing, Web page ranking, and log analysis. They paid little attention to optimizing structured data processing: they rely on brute-force scanning, which is inefficient for common structured-data workloads such as select and filter. To address this problem, we introduce global indexing, a technique widely used in databases, to optimize queries and analysis over a range of the overall dataset. A global index helps eliminate redundant map tasks, which reduces I/O and scheduling costs. Finally, we evaluate our framework at four data selection ratios (80%, 50%, 30%, and 10%) under different cluster sizes. Response time improves by up to 5x, I/O cost by up to 10x, and scheduling cost by up to 11x.
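
The abstract does not describe the index structure itself, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the input is partitioned into splits whose key ranges are sorted and non-overlapping; the names `SplitRange`, `GlobalIndex`, and `splits_for_range` are hypothetical. Given a range query, the index returns only the splits that can contain matching records, so map tasks need to be scheduled for those splits alone.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitRange:
    """Hypothetical index entry: the key range covered by one input split."""
    split_id: int
    min_key: int
    max_key: int  # inclusive upper bound


class GlobalIndex:
    """Toy global index; assumes split key ranges do not overlap."""

    def __init__(self, entries):
        # Sort by lower bound; with non-overlapping ranges the upper
        # bounds are then sorted as well, so both lists support bisect.
        self.entries = sorted(entries, key=lambda e: e.min_key)
        self._mins = [e.min_key for e in self.entries]
        self._maxs = [e.max_key for e in self.entries]

    def splits_for_range(self, low, high):
        """Return ids of splits whose key range overlaps [low, high]."""
        if low > high:
            return []
        # Index of the first split whose max_key >= low.
        start = bisect_left(self._maxs, low)
        # One past the last split whose min_key <= high.
        end = bisect_right(self._mins, high)
        return [e.split_id for e in self.entries[start:end]]


# Ten splits of 100 keys each: split i covers keys [100*i, 100*i + 99].
index = GlobalIndex([SplitRange(i, 100 * i, 100 * i + 99) for i in range(10)])
selected = index.splits_for_range(250, 549)
```

In this toy setup the query touches 4 of the 10 splits, so a scheduler consulting the index would launch 4 map tasks instead of 10. The lower the selection ratio, the more splits can be skipped.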
