    Liu Xianmin, Li Jianzhong. Discovering Extended Conditional Functional Dependencies[J]. Journal of Computer Research and Development, 2015, 52(1): 130-140. DOI: 10.7544/issn1000-1239.2015.20130691

    Discovering Extended Conditional Functional Dependencies

    • Extended conditional functional dependencies (eCFDs) were proposed as an extension of conditional functional dependencies (CFDs) for data cleaning. Compared with CFDs, eCFDs admit more kinds of value patterns and capture more semantic information. However, little work has addressed eCFDs. This paper focuses on the eCFD discovery problem, whose CFD counterpart has been studied extensively. To the best of our knowledge, this is the first work on eCFD discovery. To avoid inconsistencies and remove redundancies, the paper gives a formal definition of the eCFD discovery problem, based on the definitions of strongly validated and weakly non-redundant eCFDs, and proposes the MeCFD method to solve it. MeCFD first generates all basic eCFDs, which are weakly non-redundant and semantically equivalent to all strongly validated eCFDs, and then constructs compound eCFDs by merging basic eCFDs. Because it searches the candidate space in depth-first order, MeCFD needs only constant memory to maintain data partitions. Efficient pruning strategies are proposed to improve its performance. Theoretical analysis establishes the correctness of MeCFD, and experiments on real data sets show that MeCFD scales well and that the pruning strategies and optimization methods are effective.