Abstract:
Non-fully structured (NFS) query is an important query approach for XML documents that lack full structural information. NFS queries address three situations: the user does not fully know the structure of an XML document, a document provides no structural information, or the documents are heterogeneous. In each case, the user can still describe a querying requirement with an NFS query that contains partial XML structural information, or only some keywords. Determining which results are meaningful is critical to the quality of NFS queries. Based on the PE model for tree-modeled XML data from the authors' previous work, a graph-based model for determining meaningful NFS query results over graph-modeled XML data, called GPE, is proposed. The GPE model mainly comprises the result granularity, the definitions of pattern and entity, the definition of equivalent pattern, and the determination rules. To handle ambiguous labels and complicated structural semantics, an equivalent pattern in GPE is evaluated by combining a domain-dictionary-based, context-constrained label similarity with a pattern structure similarity. This equivalent pattern evaluation can greatly improve the precision of meaningful result determination. In extensive experiments on both a real dataset and an XML benchmark, GPE outperforms the PE model in both recall and precision.