    Xiong Ping, Zhu Tianqing. A Data Anonymization Approach Based on Impurity Gain and Hierarchical Clustering[J]. Journal of Computer Research and Development, 2012, 49(7): 1545-1552.

    A Data Anonymization Approach Based on Impurity Gain and Hierarchical Clustering

    • Data anonymization is one of the important techniques for preserving privacy in data publishing. This paper introduces the basic concept of data anonymization and its application models, and discusses the requirements that an anonymized dataset should meet. To resist background knowledge attacks, a new data anonymization approach based on impurity gain and hierarchical clustering is proposed. The impurity of a cluster measures the randomness of its sensitive attribute values, and the cluster-merging process is governed by two criteria: the information loss caused by generalization should be minimized, and the impurity gain should be maximized. With this method, the anonymized dataset satisfies both the k-anonymity model and the l-diversity model, while information loss is kept low and the sensitive attribute values in each cluster are close to uniformly distributed. The experiment section provides an evaluation method that compares the anonymized dataset with the original one, assessing quality by the average information loss and the average impurity. The experimental results validate the effectiveness of the method.
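The merging idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything concrete here is an assumption: Gini impurity stands in for the impurity measure, information loss is taken as the average normalized range of numeric quasi-identifiers, and the merge score is simply impurity gain minus information loss. The function names (`gini_impurity`, `info_loss`, `merge_score`, `anonymize`) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def gini_impurity(values):
    """Gini impurity of sensitive values (assumed impurity measure)."""
    n = len(values)
    counts = Counter(values)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def info_loss(qi_records, ranges):
    """Average normalized range over numeric quasi-identifiers (assumed loss)."""
    loss = 0.0
    for j, full in enumerate(ranges):
        col = [r[j] for r in qi_records]
        loss += (max(col) - min(col)) / full if full else 0.0
    return loss / len(ranges)


def merge_score(a, b, ranges):
    """Impurity gain of merging a and b, minus the merged information loss.

    Combining the two criteria by subtraction is an assumption made for
    this sketch; the paper may weigh them differently.
    """
    n_a, n_b = len(a["s"]), len(b["s"])
    before = (n_a * gini_impurity(a["s"]) + n_b * gini_impurity(b["s"])) / (n_a + n_b)
    gain = gini_impurity(a["s"] + b["s"]) - before
    return gain - info_loss(a["qi"] + b["qi"], ranges)


def anonymize(records, sensitive, k, l):
    """Agglomerative merging until every cluster has >= k records
    and >= l distinct sensitive values (or only one cluster remains)."""
    ranges = [max(col) - min(col) for col in zip(*records)]
    clusters = [{"qi": [r], "s": [s]} for r, s in zip(records, sensitive)]

    def ok(c):
        return len(c["s"]) >= k and len(set(c["s"])) >= l

    while len(clusters) > 1 and not all(ok(c) for c in clusters):
        # Only consider merges that involve at least one unsatisfied cluster.
        candidates = [
            (i, j)
            for i, j in combinations(range(len(clusters)), 2)
            if not ok(clusters[i]) or not ok(clusters[j])
        ]
        i, j = max(
            candidates,
            key=lambda p: merge_score(clusters[p[0]], clusters[p[1]], ranges),
        )
        merged = {
            "qi": clusters[i]["qi"] + clusters[j]["qi"],
            "s": clusters[i]["s"] + clusters[j]["s"],
        }
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters


# Illustrative data: (age, zip code) quasi-identifiers and a sensitive attribute.
records = [(25, 100), (27, 100), (40, 200), (42, 210), (60, 300), (63, 310)]
sensitive = ["flu", "cancer", "flu", "hiv", "flu", "flu"]
clusters = anonymize(records, sensitive, k=2, l=2)
```

Each resulting cluster would then be generalized (for example, by replacing each quasi-identifier with the cluster's value range) before publication. The pairwise search is quadratic per merge step, which is acceptable for a sketch but not for large datasets.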
