Abstract:
By recording the path from the root node to the current node, a path label allows the positional relationship between two nodes of an XML document to be determined efficiently by comparing their labels, which can significantly improve the overall performance of XML query processing. Moreover, a good storage strategy for path labels not only improves disk space utilization but also reduces costly I/O operations. This paper proposes an optimal storage strategy for static path labeling schemes to address this storage problem. The basic idea is that, when the components of path labels are stored, each component is assigned a prefix according to the sum of the frequencies of the region it belongs to, which reduces the required storage space. Unlike existing methods, the proposed strategy does not rely on pre-specified prefixes, which are too inflexible to exploit the frequency information of different components. Experimental results verify the feasibility and effectiveness of the proposed storage strategy.
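To make the basic idea concrete, the following is a minimal sketch of one way frequency-dependent prefixes could be assigned. It is not the paper's algorithm. It assumes that a path label is a non-empty tuple of non-negative integers, that a component's "region" is simply the number of bits needed to store it, and that standard Huffman coding is used as a stand-in for the prefix assignment. The names `region_of`, `build_prefixes`, `encode`, and `decode` are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def region_of(value: int) -> int:
    """Hypothetical region: the number of bits needed to store the component."""
    return max(value.bit_length(), 1)


def build_prefixes(labels):
    """Give each region a prefix-free bit string; frequent regions get shorter ones.

    Assumption: plain Huffman coding over the summed region frequencies.
    """
    freq = Counter(region_of(c) for label in labels for c in label)
    if len(freq) == 1:  # a single region still needs a one-bit prefix
        return {next(iter(freq)): "0"}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(f, next(tie), {r: ""}) for r, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {r: "0" + code for r, code in left.items()}
        merged.update({r: "1" + code for r, code in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def encode(label, prefixes):
    """Encode each component as <region prefix><value padded to the region width>."""
    parts = []
    for c in label:
        width = region_of(c)
        parts.append(prefixes[width] + format(c, "0{}b".format(width)))
    return "".join(parts)


def decode(bits, prefixes):
    """Invert encode(): read a prefix, then that region's fixed-width payload."""
    by_prefix = {code: region for region, code in prefixes.items()}
    label, i = [], 0
    while i < len(bits):
        j = i + 1
        while bits[i:j] not in by_prefix:
            j += 1
            if j > len(bits):
                raise ValueError("bit string does not start with a known prefix")
        width = by_prefix[bits[i:j]]
        label.append(int(bits[j:j + width], 2))
        i = j + width
    return tuple(label)


# Illustrative labels only; small components dominate, as in many XML trees.
labels = [(1,), (1, 1), (1, 2), (1, 2, 1), (1, 2, 300)]
prefixes = build_prefixes(labels)
encoded = [encode(label, prefixes) for label in labels]
decoded = [decode(bits, prefixes) for bits in encoded]
```

Because the prefixes are derived from the observed frequencies rather than fixed in advance, the most common region in this sample receives the shortest prefix, which is the flexibility that the abstract contrasts with pre-specified prefixes. Since the prefix set is prefix-free and each region has a fixed payload width, the encoded labels can be decoded without separators.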