基于隐私保护的分类挖掘
Privacy Preserving Classification Mining
-
摘要: 基于隐私保护的分类挖掘是近年来数据挖掘领域的热点之一,如何对原始真实数据进行变换,然后在变换后的数据集上构造判定树是研究的重点.基于转移概率矩阵提出了一个新颖的基于隐私保护的分类挖掘算法,可以适用于非字符型数据(布尔类型、分类类型和数字类型)和非均匀分布的原始数据,可以变换标签属性.实验表明该算法在变换后的数据集上构造的分类树具有较高的精度.Abstract: Privacy preserving classification mining is one of the fast-growing sub-areas of data mining. How to perturb original data and then build a decision tree based on perturbed data is the key research challenge. By applying transition probability matrix a novel privacy preserving classification mining algorithm is proposed, which suits non-char type data (Boolean, categorical, and numeric type) and non-uniform probability distribution of original data, and can perturb label attribute. Experimental results demonstrate that the decision tree built using this algorithm on perturbed data has a classifying accuracy comparable to that of the decision tree built using un-privacy-preserving algorithm on original data.
-
Keywords:
- data mining /
- classification /
- decision tree /
- privacy preserving /
- transition probability matrix
-
计量
- 文章访问数: 796
- HTML全文浏览量: 0
- PDF下载量: 2562