Abstract:
Feature selection is an effective dimensionality reduction technique in machine learning. In the era of big data, data security has become a major concern, and performing feature selection while preserving privacy is a challenging problem that urgently needs to be solved. The rough hypercuboid model is an uncertainty approximation model that combines rough set theory with hypercuboid learning; by introducing a supervised information granulation technique and multiple feature evaluation criteria, it provides an efficient feature selection method for numerical approximate classification problems. In this paper, we propose a novel privacy-preserving multi-party federated feature selection algorithm based on the rough hypercuboid model and the particle swarm optimization algorithm. First, a centralized (client/server) federated feature selection architecture for multi-party participation is established. Based on this architecture, the rough hypercuboid model and the particle swarm optimization algorithm are used to search for the optimal feature subset on each client, and a novel global feature subset evaluation strategy for multiple participants is proposed on the server. Then, a particle initialization strategy designed for the federated environment improves the ability of the proposed algorithm to select features collaboratively across multiple participants. Finally, experimental results on twelve UCI benchmark datasets show that, compared with six traditional feature selection algorithms, the feature subset selected by the proposed algorithm achieves higher classification performance on each participant while satisfying data privacy protection.
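To make the client/server workflow concrete, the following is a minimal sketch of the general idea and not the algorithm proposed in this paper. It makes several assumptions: the client-side fitness is a leave-one-out 1-nearest-neighbour accuracy used as a stand-in for the rough hypercuboid evaluation criteria, the server-side strategy is a plain sample-weighted average of client scores, the particle initialization is uniformly random, and all function names (`local_score`, `global_score`, `binary_pso`, `federated_select`) are hypothetical. Only feature masks and scalar scores are exchanged; raw data stays on each client.

```python
import math
import random


def local_score(X, y, mask):
    """Client side (stand-in fitness, NOT the rough hypercuboid measure):
    leave-one-out 1-NN accuracy using only the features selected by mask."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_d, best_label = None, None
        for k, xk in enumerate(X):
            if k == i:
                continue
            d = sum((xi[j] - xk[j]) ** 2 for j in idx)
            if best_d is None or d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def global_score(clients, mask):
    """Server side (assumed strategy): sample-weighted mean of client scores.
    In a real deployment each client would send back only its scalar score."""
    total = sum(len(y) for _, y in clients)
    return sum(len(y) * local_score(X, y, mask) for X, y in clients) / total


def binary_pso(fitness, n_features, n_particles=8, n_iters=15, seed=0):
    """Binary PSO with a sigmoid transfer function; maximizes fitness(mask)."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                v = (0.7 * vel[i][j]
                     + 1.5 * rng.random() * (pbest[i][j] - pos[i][j])
                     + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))  # clamp velocity
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def federated_select(clients, n_features):
    """Each client proposes a mask via local PSO; the server keeps the
    proposal with the best global score. Simulated here in one process."""
    proposals = [
        binary_pso(lambda m, X=X, y=y: local_score(X, y, m), n_features, seed=k)[0]
        for k, (X, y) in enumerate(clients)
    ]
    return max(proposals, key=lambda m: global_score(clients, m))


def make_toy_client(seed, n=20):
    """Toy data: feature 0 tracks the label, features 1 and 2 are noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[label + rng.gauss(0, 0.1), rng.random(), rng.random()] for label in y]
    return X, y
```

For example, `federated_select([make_toy_client(1), make_toy_client(2)], 3)` returns a 0/1 mask over the three toy features. Averaging scalar scores on the server is the simplest aggregation choice; it is used here only to show where a global evaluation strategy would plug in.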