    Yu Lisheng, Zhang Yansong, Wang Shan, and Zhang Qian. Research on Simulative Column-Storage Model Policy Based on Row-Storage Model[J]. Journal of Computer Research and Development, 2010, 47(5): 878-885.

    Research on Simulative Column-Storage Model Policy Based on Row-Storage Model

    • Abstract: Column-storage models perform very well in read-only data warehouse applications. Many studies show that, for typical OLAP (online analytical processing) queries, column-storage databases substantially outperform traditional row-storage databases. Based on the characteristics of the column-storage model and its data processing pattern, the authors propose a simulated column-storage model built on the traditional row-storage relational model. In this model, each column of every original table is reorganized and stored as a new, independent relational table, and a data processing model similar to that of existing column stores is provided to reduce the I/O cost of query processing. The authors also propose two optimized variants, one based on clustering related columns and one fully indexed, aimed specifically at improving OLAP query performance. Experiments across five data storage models show that the simulated column-storage model delivers good data access and OLAP query performance, with overall performance falling between that of the traditional row-storage model and that of existing column-storage models. In real applications, deploying a separate column-storage system to improve query performance may require a large additional investment; the simulated model, built on a traditional row-storage relational database, reduces overall deployment cost while still providing good performance support for OLAP applications.
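    The core idea in the abstract, storing each column of a row-oriented table as its own relational table and answering a query from only the columns it needs, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the use of SQLite through Python's `sqlite3` module, the `sales` table and its columns, the `row_id`/`value` layout of the per-column tables, and the helper names.

```python
import sqlite3


def build_row_table(conn):
    """Create a small row-store table (hypothetical schema, for illustration only)."""
    conn.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL, note TEXT)"
    )
    rows = [(1, "north", 10.0, "a"), (2, "south", 20.0, "b"), (3, "north", 5.0, "c")]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)


def split_into_column_tables(conn, table, key, columns):
    """Store each column as its own two-column table: (row_id, value)."""
    for col in columns:
        conn.execute(
            f"CREATE TABLE {table}_{col} AS "
            f"SELECT {key} AS row_id, {col} AS value FROM {table}"
        )
        # Index on row_id so per-column tables can be stitched back together.
        conn.execute(f"CREATE INDEX idx_{table}_{col} ON {table}_{col}(row_id)")


def total_by_region_rows(conn):
    """Aggregate directly on the original row-store table."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


def total_by_region_columns(conn):
    """Same aggregate, touching only the two per-column tables the query needs."""
    return conn.execute(
        "SELECT r.value, SUM(a.value) "
        "FROM sales_region AS r JOIN sales_amount AS a ON a.row_id = r.row_id "
        "GROUP BY r.value ORDER BY r.value"
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_row_table(conn)
split_into_column_tables(conn, "sales", "id", ["region", "amount", "note"])
row_result = total_by_region_rows(conn)
col_result = total_by_region_columns(conn)
print(row_result == col_result)
```

    In this sketch the aggregate never reads the `note` column, which is the kind of I/O saving the simulated model aims for; the price is the join on `row_id` needed to reassemble rows.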

       
