高级检索

    基于互关联后继树的XML索引技术

    • 摘要: 提出了一种新的根树节点编码方法——基于叶序区间的节点编码(LOINS).编码方法只需对根树后序遍历一次即可完成,能实现常数时间内对任意两个树节点间前后代关系的判断.同时,结合互关联后继树模型(IRST)的标引性、可压缩性等特点,提出基于IRST的根树索引模型IsBaRTI-I,及对该模型空间优化的索引模型IsBaRTI-II. IsBaRTI-I,II采用树节点名称(标签)及其在根树(XML文档树)中的出现计数索引节点间的父子关系和节点叶序区间编码,实现索引结构和节点编码的相互统一. IsBaRTI-I,II索引建立时间、空间代价小,可快速查询满足XPath表达式在XML文档树中的节点序列和路径.

       

      Abstract: A new numbering scheme for labeling rooted trees based on leaf order interval numbering scheme (LOINS) is proposed. The nodes in a rooted tree can be encoded in once traversal of the tree, and the ancestor-descendantship among nodes can be determined in constant time based on LOINS. Furthermore, IsBaRTI-I, a novel index for rooted tree structure data model, is proposed which takes the advantages of IRST, such as indexing and compressibility. IsBaRTI-II, the space optimization version of IsBaRTI-I, is also introduced. IsBaRTI-I,II indexes the ancestor-descendantship among nodes and the LOINS number of node by the name (label) of the node and the count of its appearance in the rooted tree. In this way, indexing structure and numbering scheme becomes a unit unity. Theory analysis and experiment result illustrates that IsBaRTI-I,II needs more little time and capacity to be built; the node series and path matching XPath expressions can be obtained more quickly than the previous XML indexes through IsBaRTI-I,II.

       

    /

    返回文章
    返回