An Efficient Greedy Algorithm for Schema Matching
-
Graphical Abstract
-
Abstract
Schema matching is the task of finding semantic correspondences between elements of two schemas, which plays a key role in many database applications, such as data integration, electronic commerce, data warehouse, semantic query processing, and XML message exchange, etc. Especially, it is a basic research issue in metadata management. Unfortunately, it still remains largely a manual, labor-intensive, and expensive process. In this paper, the schema matching problem is treated as a combinatorial problem. Firstly, schemas are transformed into multi-labeled graphs, which are the internal schema model for schema matching. Therefore, the schema matching problem is reduced to the labeled graph matching problem. Secondly, a generic graph similarity measure is discussed, which uses the labels of nodes and the edges to compute the similarity between the two schemas. Then, an objective function based on the multi-labeled graph similarity is proposed. Based on the objective function, a greedy matching algorithm is designed to find the desired matching state for schema matching. A prominent characteristic of this method is that the algorithm combines the feasible matching information to obtain optimal matching. Finally, some schema samples are used to test the greedy matching algorithm. The test results confirm that the algorithm is effective, which can obtain mapping results with high quality.
-
-