高级检索

    一种有效的贪婪模式匹配算法

    An Efficient Greedy Algorithm for Schema Matching

    • 摘要: 模式匹配问题是意图获得两个模式中所包含个体对象之间的语义匹配和映射,其结果表示源模式的个体对象与目标模式的个体对象之间存在特定的语义关联.它在数据库应用领域起到关键性的作用,例如数据集成、电子商务、数据仓库、XML消息交换等,特别地,它已成为元数据管理的基本问题.然而,模式匹配很大程度上依赖人工的操作,是一个费时费力的过程.模式匹配问题可以归约为一个组合优化问题:多标记图匹配问题.首先,将模式表示为多标记图,将模式匹配转换为多标记图匹配问题.其次,提出多标记图的相似性度量方法,进而提出基于多标记图相似性的模式匹配目标优化函数.最后,在这个目标函数基础上设计实现了一个贪婪匹配算法,其最显著的特点是综合多种可用的标记信息,灵活准确地获得最优的匹配结果.

       

      Abstract: Schema matching is the task of finding semantic correspondences between elements of two schemas, which plays a key role in many database applications, such as data integration, electronic commerce, data warehouse, semantic query processing, and XML message exchange, etc. Especially, it is a basic research issue in metadata management. Unfortunately, it still remains largely a manual, labor-intensive, and expensive process. In this paper, the schema matching problem is treated as a combinatorial problem. Firstly, schemas are transformed into multi-labeled graphs, which are the internal schema model for schema matching. Therefore, the schema matching problem is reduced to the labeled graph matching problem. Secondly, a generic graph similarity measure is discussed, which uses the labels of nodes and the edges to compute the similarity between the two schemas. Then, an objective function based on the multi-labeled graph similarity is proposed. Based on the objective function, a greedy matching algorithm is designed to find the desired matching state for schema matching. A prominent characteristic of this method is that the algorithm combines the feasible matching information to obtain optimal matching. Finally, some schema samples are used to test the greedy matching algorithm. The test results confirm that the algorithm is effective, which can obtain mapping results with high quality.

       

    /

    返回文章
    返回