Abstract:
Although it has many applications in data mining, fault diagnosis, bioinformatics, and other areas, the support vector clustering (SVC) algorithm is held back by two shortcomings: expensive computation and poor performance. To address these two bottlenecks, a novel algorithm, reduced support vector clustering (RSVC), is proposed. RSVC shares the framework of SVC but adds a reduction strategy and a new labeling approach. The reduction strategy is designed according to the Schrödinger equation; it extracts the data that are important to model development to form a qualified subset, and optimizes the objective on this subset. The resulting clustering model loses little in quality while costing less to compute. The new labeling approach is based on geometric properties of the feature space of the Gaussian kernel function; it detects clusters by clustering the support vectors and the remaining data separately, in a clear way. These geometric properties are verified in order to guarantee the validity of the new labeling approach. Theoretical analysis and empirical evidence demonstrate that RSVC overcomes the two bottlenecks well and has an advantage over its peers in performance and efficiency. RSVC also behaves well in other respects, which suggests that it can serve as a user-friendly clustering method in more applications.
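As a small illustration of the kind of geometric property that a Gaussian kernel feature space has, the sketch below checks two standard facts: every point maps to a unit-norm vector, because K(x, x) = 1, and the squared feature-space distance between two points equals 2 - 2K(x, y). This is not the labeling procedure of RSVC, which the abstract does not spell out. The function names and the width parameter `q` are assumptions made for this example only.

```python
import math


def gaussian_kernel(x, y, q=1.0):
    """Gaussian kernel K(x, y) = exp(-q * ||x - y||^2).

    q is a hypothetical width parameter chosen for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-q * sq_dist)


def feature_space_distance_sq(x, y, q=1.0):
    """Squared distance between the feature-space images of x and y.

    ||Phi(x) - Phi(y)||^2 = K(x, x) + K(y, y) - 2 K(x, y) = 2 - 2 K(x, y),
    because K(x, x) = 1 for the Gaussian kernel.
    """
    return (gaussian_kernel(x, x, q)
            + gaussian_kernel(y, y, q)
            - 2.0 * gaussian_kernel(x, y, q))


def feature_space_angle(x, y, q=1.0):
    """Angle between Phi(x) and Phi(y).

    Both images have unit norm, so cos(theta) = K(x, y).
    The clamp guards against rounding just outside [-1, 1].
    """
    k = gaussian_kernel(x, y, q)
    return math.acos(max(-1.0, min(1.0, k)))


if __name__ == "__main__":
    a, b = (0.0, 0.0), (1.0, 0.0)
    print("K(a, a) =", gaussian_kernel(a, a))
    print("squared feature distance =", feature_space_distance_sq(a, b))
    print("feature angle (radians) =", feature_space_angle(a, b))
```

Because all images lie on the unit sphere, nearby inputs subtend a small angle in feature space and distant inputs approach a right angle. Geometric facts of this sort are what make angle-based or distance-based labeling arguments possible for Gaussian kernels.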