Abstract:
The column-storage model performs very well in read-only data warehouse applications. Many studies show that, for typical OLAP (online analytical processing) queries, column-storage databases outperform traditional row-storage databases. In this paper, based on the characteristics of the column-storage model and its particular data processing pattern, the authors propose a simulative column-storage model built on the traditional row-storage relational model. In the simulative column-storage model, they reorganize each column of every original table and store it as a new independent relational table, and they provide a data processing model similar to that of existing column-storage systems to reduce the I/O cost of query processing. They also propose two optimized simulative column-storage models, one based on clustering related columns and one based on a fully indexed model, aimed specifically at improving the performance of OLAP queries. Experiments across five data storage models show that the simulative column-storage model performs well in both data access and OLAP query processing, with overall performance between that of the traditional row-storage model and that of the existing column-storage model. In real applications, however, users may need a large additional investment to deploy an almost entirely new system based on the existing column-storage model in order to improve query performance, whereas the simulative column-storage model, built on the traditional row-storage relational model, can markedly reduce the cost of system redeployment. In particular, it provides better performance for OLAP applications.
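As an illustration only, the column-per-table reorganization described above might be sketched as follows. The sketch assumes SQLite, accessed through Python's standard sqlite3 module, as a stand-in row-storage engine. The sample `sales` table, the `row_id`/`value` layout, the `table__column` naming scheme, and the helper `decompose_table` are all hypothetical and are not taken from the paper.

```python
import sqlite3


def decompose_table(conn, table, key, columns):
    """Store each listed column of `table` as its own two-column table.

    Hypothetical layout: one table per column, holding (row_id, value),
    where row_id is the key of the original row.
    """
    created = []
    for col in columns:
        col_table = f"{table}__{col}"
        conn.execute(
            f'CREATE TABLE "{col_table}" (row_id INTEGER PRIMARY KEY, value)'
        )
        conn.execute(
            f'INSERT INTO "{col_table}" (row_id, value) '
            f'SELECT "{key}", "{col}" FROM "{table}"'
        )
        created.append(col_table)
    return created


# A small row-storage table used as the starting point (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "east", 10.0), (2, "west", 20.0), (3, "east", 5.0)],
)

column_tables = decompose_table(conn, "sales", "id", ["region", "amount"])

# An aggregate over one attribute touches only that attribute's table.
total = conn.execute('SELECT SUM(value) FROM "sales__amount"').fetchone()[0]

# A filter on another attribute rebuilds rows by joining on row_id.
east_total = conn.execute(
    'SELECT SUM(a.value) FROM "sales__region" AS r '
    'JOIN "sales__amount" AS a ON a.row_id = r.row_id '
    "WHERE r.value = ?",
    ("east",),
).fetchone()[0]
```

In this sketch, a query over a single attribute reads one narrow table instead of full rows, which is the intuition behind the I/O reduction. Queries that span several attributes pay for joins on `row_id`, a cost that clustering related columns or adding indexes would presumably aim to lower.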