Privacy-Preserving Statistics Protocol for Set-Based Computation
-
Graphical Abstract
-
Abstract
Mining valuable information from high volume of data, which can be used to guide real-world business and management, has been an important demand in the era of big data. In reality, data is usually distributed among different entities. A common way for data analysis is to let a trusted party to perform algorithms on the collected data. However, this trivial approach not only puts data owners privacy at risk, but also may bring with potential data breach issued by some outside attackers. With more and more law and regulation on data security and privacy coming out, there are more requirements for the whole life of data, which includes privacy-preserving data storing, processing and so on. It has been a common sense to leverage privacy-preserving techniques to protect sensitive data while still enabling participants to mine value from the whole dataset contributed from each party. Among those techniques, private set intersection (PSI) attracts more and more attention as it can be applied in many real-world scenarios. However, most PSI protocols only focus on computing the intersection itself, and in many cases the participants may also want to compute a function of the intersection, e.g., the cardinality of intersection or intersection sum, and even more general functions. To this end, a several of protocols are designed in this paper. By these protocols, one can privately compute intersection cardinality or intersection sum without leaking the elements of the intersection. Notably, these protocols, without resorting to homomorphic encryption or general circuit-based technique, can complete needed computation by only leveraging oblivious transfer. Since oblivious transfer can be efficiently extended by current highly efficient oblivious transfer extension protocols, this means these protocols are highly efficient in computation complexity. In addition, communication complexity of the protocols is also optimized by leveraging existing hashing technique. The protocols can be formally proven secure under semi-honest adversary, and complexity analysis is also provided in the paper.
-
-