ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2015, Vol. 52, Issue (7): 1463-1476. doi: 10.7544/issn1000-1239.2015.20140236

Mixture of Probabilistic Canonical Correlation Analysis

Zhang Bo (1,2,3), Hao Jie (4), Ma Gang (1,2), Yue Jinpeng (1,2), Zhang Jianhua (1,2), Shi Zhongzhi (1)

  1. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
  2. University of Chinese Academy of Sciences, Beijing 100049
  3. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116
  4. School of Medicine Information, Xuzhou Medical College, Xuzhou, Jiangsu 221004
  • Online: 2015-07-01

Abstract: Canonical correlation analysis (CCA) is a statistical tool for analyzing the correlation between two sets of random variables. A critical limitation of CCA is that it can detect only linear correlation between the two domains that is globally valid across both data sets, which is insufficient to reveal the many non-linear correlations found in real-world data. There are three main approaches to addressing this limitation: kernel mapping, neural networks, and localization. In this paper, we construct a mixture of local linear probabilistic canonical correlation analysis (PCCA) models, called MixPCCA, based on the idea of localization, and propose a two-stage EM algorithm to estimate the model parameters. Determining the number of local linear models is a fundamental issue, which we address using the framework of cluster ensembles. In addition, we put forward a theoretical framework for applying the MixPCCA model to pattern recognition. Results on the USPS and MNIST handwritten image datasets demonstrate that the proposed MixPCCA model not only captures complex global non-linear correlation but can also detect correlation that exists only in local regions, which traditional CCA and PCCA fail to discover.
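The limitation described in the abstract can be illustrated with a small numerical sketch. This is not the MixPCCA model or its two-stage EM algorithm: the synthetic data (y = |x| plus noise), the hard split on the sign of x standing in for learned mixture components, and the helper name `first_canonical_correlation` are all assumptions made for illustration. It shows only that a single global linear CCA can miss a relationship that local linear models recover.

```python
import numpy as np

def first_canonical_correlation(X, Y, reg=1e-8):
    """Largest canonical correlation between views X (n x p) and Y (n x q).

    Classical linear CCA via whitening and SVD; `reg` is a small ridge term
    (an assumption added here for numerical stability).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # M = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    return np.linalg.svd(M, compute_uv=False)[0]

# Synthetic two-view data (hypothetical): y depends on x non-linearly.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(2000, 1))
y = np.abs(x) + 0.05 * rng.normal(size=(2000, 1))

# One global linear model sees almost no correlation.
global_rho = first_canonical_correlation(x, y)

# Hard partition by the sign of x, a crude stand-in for mixture components.
left = x[:, 0] < 0
local_rhos = [
    first_canonical_correlation(x[left], y[left]),
    first_canonical_correlation(x[~left], y[~left]),
]

print("global:", global_rho)
print("local:", local_rhos)
```

In this toy setting the partition is known in advance. The paper's contribution lies in learning the local models and their number from data, which the sketch does not attempt.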

Key words: canonical correlation analysis, probabilistic canonical correlation analysis, mixture probabilistic model, cluster ensembles, pattern recognition
