Abstract:
Attribute network embedding aims to map the nodes and link relationships of a network into a latent low-dimensional space while preserving the intrinsic properties of the node attributes and the network topology. A heterogeneous attribute network contains multiple types of nodes and link relationships, which provide rich auxiliary information but also pose new challenges for network embedding. A novel model named HANEP (heterogeneous attribute network embedding based on the PPMI) is proposed for mapping the multiple types of nodes and link relationships in a heterogeneous attribute network into a latent low-dimensional space while preserving both the attribute features of the nodes and the complex, diverse, and rich semantic information carried by the different types of heterogeneous links. Specifically, HANEP first transforms the attribute features into an attribute graph and extracts network topology graphs based on different meta-paths. Next, it constructs probabilistic co-occurrence (PCO) matrices for the attribute graph and for each topology graph by random surfing, calculates the positive point-wise mutual information (PPMI) from each PCO matrix, and then learns node representations with multiple auto-encoders. The meta-paths capture the link relationships between the multiple types of nodes in a heterogeneous network, the attribute graph describes the nonlinear manifold structure of the node attributes, a pairwise constraint helps to integrate the consistent and complementary relationships, and the PPMI representations capture the high-order proximity and potentially nonlinear relationships of attributes and topology. Experimental results on three datasets verify the effectiveness of HANEP.
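The PCO and PPMI steps can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes a common form of random surfing (a k-step random walk with restart whose visiting probabilities are accumulated) and the standard PPMI definition. The function names `random_surf_pco` and `ppmi`, the restart parameter `alpha`, and the step count `num_steps` are hypothetical and are not taken from the paper.

```python
import numpy as np


def random_surf_pco(adj, num_steps=3, alpha=0.98):
    """Build a PCO matrix by accumulating random-surfing probabilities.

    Assumed update (not specified in the abstract):
        p_k = alpha * p_{k-1} @ T + (1 - alpha) * p_0
    where T is the row-normalized adjacency matrix and p_0 is the identity.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Row-normalize into a transition matrix; isolated nodes keep a zero row.
    row_sums = adj.sum(axis=1, keepdims=True)
    trans = np.divide(adj, row_sums, out=np.zeros_like(adj), where=row_sums > 0)
    p0 = np.eye(n)
    p = p0.copy()
    pco = np.zeros((n, n))
    for _ in range(num_steps):
        p = alpha * (p @ trans) + (1.0 - alpha) * p0
        pco += p
    return pco


def ppmi(pco):
    """Positive point-wise mutual information of a co-occurrence matrix."""
    pco = np.asarray(pco, dtype=float)
    total = pco.sum()
    row = pco.sum(axis=1, keepdims=True)   # shape (n, 1)
    col = pco.sum(axis=0, keepdims=True)   # shape (1, n)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(pco * total / (row @ col))
    # Zero co-occurrence yields -inf or NaN; treat those entries as 0.
    pmi[~np.isfinite(pmi)] = 0.0
    return np.maximum(pmi, 0.0)


# Toy 3-node path graph standing in for one meta-path topology graph.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
features = ppmi(random_surf_pco(adj))
```

Under these assumptions, the same two functions would be applied once to the attribute graph and once to each meta-path topology graph, and each resulting PPMI matrix would serve as the input to its own auto-encoder.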