    Manifold Learning Algorithm DC-ISOMAP of Data Lying on the Well-Separated Multi-Manifold with Same Intrinsic Dimension

    • Abstract: Manifold learning has become an important research topic in machine learning and data mining. Existing manifold learning algorithms generally assume that the high-dimensional data lie on a single manifold, so they cannot be applied to the many data sets that are sampled from multiple manifolds, and both theory and algorithms are lacking for data supported on a mixture of manifolds. Although several algorithms based on spectral clustering or manifold clustering can separate sub-manifolds and compute their low-dimensional embeddings, they do not take the topological structure of the multi-manifold into account, and they require the number of sub-manifolds to be known in advance. This paper proposes DC-ISOMAP, a nonlinear dimensionality reduction method for data lying on well-separated multi-manifolds with the same intrinsic dimension. DC-ISOMAP first decomposes the data set into individual sub-manifolds by propagating the tangent subspace from the point of maximum sampling density across each separate sub-manifold, and then computes the low-dimensional embedding of each sub-manifold independently. Finally, the embeddings of all sub-manifolds are composed into their proper positions and orientations based on the inter-connections between the sub-manifolds. Experimental results on synthetic data and on real-world face images show that the algorithm constructs an accurate low-dimensional representation of the data efficiently.
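    The decomposition step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the density estimate (inverse mean k-nearest-neighbor distance), the local-PCA tangent estimate, the angle threshold, and all names (`decompose`, `tangent_basis`, `k`, `max_angle_deg`) are assumptions chosen for illustration. The sketch grows one sub-manifold at a time from the densest unlabeled point, accepting a neighbor only when its tangent subspace is close to that of the point it was reached from.

```python
import numpy as np

def knn(X, k):
    """Indices and distances of the k nearest neighbors of every point (brute force)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]          # column 0 is the point itself
    dist = np.sqrt(np.take_along_axis(d2, idx, axis=1))
    return idx, dist

def tangent_basis(X, members, d):
    """Orthonormal basis (D x d) of the local tangent space, estimated by local PCA."""
    P = X[members] - X[members].mean(axis=0)
    _, _, Vt = np.linalg.svd(P, full_matrices=False)
    return Vt[:d].T

def decompose(X, d, k=8, max_angle_deg=30.0):
    """Label each point with a sub-manifold index (hypothetical helper, not from the paper)."""
    n = len(X)
    idx, dist = knn(X, k)
    density = 1.0 / (dist.mean(axis=1) + 1e-12)       # assumed density proxy
    bases = [tangent_basis(X, np.append(idx[i], i), d) for i in range(n)]
    cos_min = np.cos(np.radians(max_angle_deg))
    labels = -np.ones(n, dtype=int)
    current = 0
    while (labels < 0).any():
        unlabeled = np.flatnonzero(labels < 0)
        seed = unlabeled[np.argmax(density[unlabeled])]  # densest remaining point
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in idx[i]:
                if labels[j] >= 0:
                    continue
                # Smallest singular value = cosine of the largest principal angle
                # between the two tangent subspaces.
                s = np.linalg.svd(bases[i].T @ bases[j], compute_uv=False)
                if s.min() >= cos_min:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Toy input: two well-separated parallel line segments in the plane (d = 1).
seg_a = np.stack([np.linspace(0, 1, 50), np.zeros(50)], axis=1)
seg_b = np.stack([np.linspace(0, 1, 50), np.full(50, 5.0)], axis=1)
toy_labels = decompose(np.vstack([seg_a, seg_b]), d=1, k=4)
print(len(set(toy_labels.tolist())), "sub-manifolds found")
```

    Under these assumptions, each labeled group could then be embedded separately (for example with a standard Isomap implementation) before the per-group embeddings are placed relative to one another; that composition step is not sketched here because the abstract does not specify it.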

       
