Core Vector Regression for Attribute Effect Control on Large Scale Dataset
Abstract
The attribute effect is a form of data bias caused by sensitive attributes, and it is widespread in real-world data. If it is not controlled, it can seriously degrade the learning performance of a regression model. To control the attribute effect in nonlinear regression models trained on large-scale biased datasets, a novel fast equal mean-core vector regression (FEM-CVR) algorithm is proposed. First, a novel equal mean-support vector regression (EM-SVR) model based on the margin maximization criterion is formulated by imposing an equal-mean constraint. It is then shown that the optimization problem of EM-SVR is equivalent to a center-constrained minimum enclosing ball (CC-MEB) problem. Building on this equivalence, FEM-CVR, a fast minimum-enclosing-ball-based nonlinear regression learning algorithm for attribute effect control on large-scale biased datasets, is obtained by applying approximate minimum enclosing ball theory and reducing the original input dataset to its core set. Some fundamental theoretical properties of the method are also discussed. Finally, extensive experiments on synthetic and real datasets show that FEM-CVR effectively controls the attribute effect in nonlinear regression models on large-scale biased datasets while retaining good generalization ability. The upper bound on its time complexity is independent of the dataset size and depends only on the approximation parameter ε of the minimum enclosing ball.
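To make the core-set idea concrete, the following is a minimal sketch of a generic (1 + ε)-approximate minimum enclosing ball computation in the style of Badoiu and Clarkson. It is not the FEM-CVR algorithm: it works in plain Euclidean space rather than a kernel-induced feature space, it has no center constraint, and the function name approx_meb and the demo data are assumptions made for illustration only. It shows how the number of iterations, and therefore the core-set size, is fixed by ε alone.

```python
import math
import numpy as np


def approx_meb(points, eps):
    """Illustrative (1 + eps)-approximate minimum enclosing ball.

    Hypothetical helper, not the paper's method: plain Euclidean
    distances, no kernel and no center constraint.
    """
    points = np.asarray(points, dtype=float)
    center = points[0].copy()          # start from an arbitrary point
    core_idx = {0}                     # indices of the core set
    n_iter = math.ceil(1.0 / eps**2)   # depends on eps only, not on len(points)
    for i in range(1, n_iter + 1):
        # Find the point farthest from the current center.
        dists = np.linalg.norm(points - center, axis=1)
        far = int(np.argmax(dists))
        core_idx.add(far)
        # Move the center a shrinking step toward that point.
        center += (points[far] - center) / (i + 1)
    # Radius needed to enclose every point from the final center.
    radius = float(np.linalg.norm(points - center, axis=1).max())
    return center, radius, sorted(core_idx)


# Demo data (assumed): random points inside the unit disk plus four
# boundary points, so the exact minimum enclosing ball has radius 1.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=2000)
radii = np.sqrt(rng.uniform(0.0, 1.0, size=2000))
inner = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
boundary = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
pts = np.vstack([inner, boundary])

center, radius, core = approx_meb(pts, eps=0.1)
print("radius:", radius, "core-set size:", len(core))
```

In this sketch the core set can never exceed ⌈1/ε²⌉ + 1 points however many inputs are supplied, although each iteration still scans every point to find the farthest one.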