Abstract:
Manifold learning has become an active research topic in machine learning and data mining. Most of its algorithms assume that the data resides on a single manifold, and both theory and algorithms are lacking when the data is supported on a mixture of manifolds. A new method, called DC-ISOMAP, is proposed for the nonlinear dimensionality reduction of data lying on separated multi-manifolds with the same intrinsic dimension. Although several algorithms based on spectral clustering or manifold clustering can separate sub-manifolds and obtain their low-dimensional embeddings, these algorithms do not consider the topological structure of the multi-manifold and require the number of sub-manifolds to be known in advance. DC-ISOMAP first decomposes a given data set into several sub-manifolds by propagating the tangent subspace of the point with maximum sampling density across a separate sub-manifold, and then the low-dimensional embedding of each sub-manifold is calculated independently. Finally, the embeddings of all sub-manifolds are composed into their proper positions and orientations based on their inter-connections. Experimental results on synthetic data as well as real-world images demonstrate that the proposed approach can construct an accurate low-dimensional representation of the data in an efficient manner.
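
As a rough illustration of the decomposition step only, the following Python sketch grows one group at a time from the unassigned point with the highest sampling density. It is a simplified stand-in, not the DC-ISOMAP procedure itself: the tangent-subspace test is replaced by a plain distance threshold, and the function name `decompose`, the parameter `eps`, and the use of the neighbor count as a density estimate are all assumptions made for this example.

```python
import math

def decompose(points, eps):
    """Split points into groups by region growing from the densest unassigned point.

    Simplification (assumption): a point joins a group when it lies within
    `eps` of a point already in the group. The tangent-subspace propagation
    mentioned in the abstract is NOT implemented here.
    """
    n = len(points)
    # eps-neighborhoods; the neighbor count serves as a crude density estimate
    nbrs = [
        [j for j in range(n) if j != i and math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [-1] * n  # -1 marks a point not yet assigned to any group
    current = 0
    while -1 in labels:
        # seed: the unassigned point with the most neighbors
        seed = max((i for i in range(n) if labels[i] == -1),
                   key=lambda i: len(nbrs[i]))
        labels[seed] = current
        stack = [seed]
        while stack:  # depth-first growth through the eps-neighborhood graph
            i = stack.pop()
            for j in nbrs[i]:
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Two well-separated line segments in the plane (toy data)
points = [(0.1 * k, 0.0) for k in range(10)] + [(0.1 * k, 5.0) for k in range(10)]
labels = decompose(points, eps=0.15)
```

In this toy setting the number of groups is not supplied by the caller; it follows from how many times a new seed is needed. The per-group embedding and the final composition step are left out of the sketch.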