Du Yuefeng, Li Xiaoguang, Song Baoyan. Discovering Consistency Constraints for Associated Data on Heterogeneous Schemas[J]. Journal of Computer Research and Development, 2020, 57(9): 1939-1948. DOI: 10.7544/issn1000-1239.2020.20190570

Discovering Consistency Constraints for Associated Data on Heterogeneous Schemas

Funds: This work was supported by the National Natural Science Foundation of China (U1811261), the Project of Liaoning Provincial Public Opinion and Network Security Big Data System Engineering Laboratory, and the Natural Science Foundation of Liaoning Province.
  • Published Date: August 31, 2020
  • Data consistency is a central issue in data quality management. Because constraints express relationships among data abstractly and formally, they are a natural technique for managing data consistency. However, the diversity of heterogeneous schemas across multiple sources poses great challenges for data consistency management, especially for constraint fusion. Moreover, data are related to one another, whether they come from a single source or from multiple sources. These relationships can be used to strengthen the semantic expressiveness of constraints, which helps uncover potential data errors. In practice, CINDs (conditional inclusion dependencies) and CCFDs (content-related conditional functional dependencies) are effective techniques for attribute matching across heterogeneous schemas and for consistency maintenance on content-related data, respectively. Building on these, we study how to discover consistency constraints for associated data on heterogeneous schemas. We first investigate three fundamental problems related to CCFD discovery, and show that the implication, satisfiability, and validation problems are NP-complete, coNP-complete, and in PTIME, respectively. To search the entire space of candidate CCFDs, we present a 2-level lattice organized by the division between the conditional attribute set and the variable attribute set of a CCFD. We then propose an incremental method for discovering fusion constraints over CINDs and CCFDs, which combines CCFDs on heterogeneous schemas via CINDs. Finally, experiments on two real-life datasets verify that our method is effective and scalable.
