Abstract:
As a nonlinear extension of the classical multidimensional scaling (MDS) algorithm, ISOMAP is well suited to visualizing nonlinear low-dimensional manifolds embedded in high-dimensional spaces. However, ISOMAP requires that the data belong to a single well-sampled cluster. When the data consist of multiple clusters, long geodesic distances may be poorly approximated by the corresponding shortest-path lengths, which makes the classical MDS algorithm used in ISOMAP unsuitable. In addition, the success of ISOMAP depends heavily on choosing a suitable neighborhood size, which is difficult to do efficiently. When the neighborhood size is unsuitable, shortcut edges are introduced into the neighborhood graph, so that the graph no longer represents the correct neighborhood structure of the data. To address these problems, a new variant of ISOMAP, GISOMAP, is presented, which uses a special case of MDS to reduce, to a certain extent, the influence of long geodesic distances and shortcut edges on distance preservation. Consequently, GISOMAP visualizes data consisting of multiple clusters better than ISOMAP and is less sensitive to the neighborhood size, which makes it easier to apply. Experimental results verify the feasibility of GISOMAP.
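
For orientation, the baseline ISOMAP pipeline that the abstract refers to (neighborhood graph, shortest-path estimates of geodesic distances, classical MDS) can be sketched roughly as follows. This is a minimal NumPy sketch of standard ISOMAP only, not of GISOMAP, whose special case of MDS is not specified in the abstract. The function name `isomap` and the parameters `n_neighbors` and `n_components` are illustrative assumptions, and the k-nearest-neighbor rule is one common way to define the neighborhood size.

```python
import numpy as np


def isomap(X, n_neighbors=5, n_components=2):
    """Baseline ISOMAP sketch: k-NN graph -> shortest paths -> classical MDS.

    Hypothetical helper for illustration; not the GISOMAP variant.
    """
    n = X.shape[0]

    # Pairwise Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # k-nearest-neighbor graph; missing edges are marked with infinity.
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nearest = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    G[rows, cols] = D[rows, cols]
    G[cols, rows] = D[rows, cols]  # make the graph undirected

    # Floyd-Warshall: shortest-path lengths approximate geodesic distances.
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])

    # Multiple clusters with a small neighborhood size leave the graph disconnected.
    if np.isinf(G).any():
        raise ValueError("neighborhood graph is disconnected; increase n_neighbors")

    # Classical MDS: double-center the squared distances, keep the top eigenvectors.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))
```

Calling `isomap(X, n_neighbors=2, n_components=1)` on points sampled along a half circle should unroll the arc onto a line. On two well-separated clusters with a small `n_neighbors`, the sketch raises an error because no path connects the clusters, which illustrates the single-cluster requirement mentioned above.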