Enhanced One-Class Support Vector Machine

Feng Aimin; Xue Hui; Liu Xuejun; Chen Songcan; Yang Ming

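Summary: Existing hyperplane-based one-class classifiers, including one-class SVM (OCSVM) and Mahalanobis one-class SVM (MOCSVM), either ignore the structural information of the data or capture it only at a coarse granularity, so the hyperplane they find is likely to be suboptimal. To address this, the enhanced one-class support vector machine (enhanced OCSVM, EnOCSVM) incorporates prior information about the data into the existing SVM algorithm. EnOCSVM first clusters the data to obtain its intrinsic distribution clusters and then embeds the structural information of each cluster into the OCSVM framework, so that it maximizes the margin while also optimizing the compactness of each cluster's data in the output space. Because the SVM framework is left unchanged, EnOCSVM retains all the advantages of the original algorithm, and it generalizes better because it incorporates the cluster structure of the data. Experiments on standard data sets show that EnOCSVM generalizes significantly better than both OCSVM and MOCSVM.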

Abstract: A one-class classifier (OCC) aims to distinguish a target class from outliers. Existing hyperplane-based OCC algorithms, such as one-class SVM (OCSVM) and Mahalanobis one-class SVM (MOCSVM), solve this problem by finding a hyperplane with the maximum distance to the origin. However, because they either neglect the structure of the given data or take that structure into account only at a relatively coarse granularity, they may obtain only a suboptimal hyperplane. To mitigate this problem, a novel OCC named enhanced one-class SVM (EnOCSVM) is proposed. EnOCSVM first obtains the distribution of the target data with an unsupervised method such as agglomerative hierarchical clustering, and then embeds the cluster information into the original OCSVM framework. In this way it optimizes the tightness of the target data and maximizes the margin from the origin simultaneously. EnOCSVM thus not only takes much more prior knowledge into account than the above algorithms, but also provides a general method for extending existing SVM algorithms to consider the intrinsic structure of the data. Moreover, the EnOCSVM optimization problem can be solved with a standard SVM implementation, as for OCSVM, so all the advantages of SVM are preserved. Experimental results on benchmark data sets show that EnOCSVM generalizes significantly better than OCSVM and MOCSVM.
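
For reference, the hyperplane-based baseline that the abstract builds on can be written as the standard ν-parameterized OCSVM primal. This is only a sketch of the well-known baseline: the symbols w, ρ, ξ_i, ν, n and the feature map φ are notation assumed here and do not appear in the abstract. The EnOCSVM cluster-tightness term is deliberately not reproduced, since the abstract does not state its form.

```latex
\min_{w,\,\xi,\,\rho} \quad \frac{1}{2}\,\|w\|^{2} \;+\; \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i \;-\; \rho
\qquad \text{s.t.} \quad \langle w, \phi(x_i)\rangle \;\ge\; \rho - \xi_i, \quad \xi_i \ge 0, \quad i = 1,\dots,n .
```

In this notation the separating hyperplane is \langle w, \phi(x)\rangle = \rho, and its distance to the origin is \rho / \|w\|, which is the margin that OCSVM maximizes. The parameter \nu \in (0, 1] is an upper bound on the fraction of training points treated as outliers.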
