    Yang Xuemei, Dong Yisheng, Xu Hongbing, Liu Xuejun, Qian Jiangbo, Wang Yongli. Online Correlation Analysis for Multiple Dimensions Data Streams[J]. Journal of Computer Research and Development, 2006, 43(10): 1744-1750.

    Online Correlation Analysis for Multiple Dimensions Data Streams

    • This paper studies the problem of identifying correlations between two multidimensional data streams under constrained resources. It proposes QuickCCA, a novel online canonical correlation analysis (CCA) algorithm based on approximation techniques. To address CCA's performance bottleneck, QuickCCA first uses column sampling with unequal probabilities to reduce the number of tuples and construct a synopsis matrix. From this synopsis matrix, the top k principal correlation coefficients between the evolving multidimensional data streams are then computed rapidly. Theoretical analysis and experiments indicate that QuickCCA can accurately identify correlations between two multidimensional data streams in the synchronous sliding-window model. Compared with existing correlation analysis algorithms for data streams, QuickCCA substantially reduces computational complexity by trading some accuracy for performance. It can serve as a generic tool for a wide range of data stream mining applications.
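
    The sampling-then-CCA idea in the abstract can be illustrated with a minimal sketch. This is not the authors' QuickCCA implementation: the abstract only says that columns (tuples) are sampled with unequal probabilities, so the norm-squared sampling rule, the function names `sample_columns` and `top_k_canonical_correlations`, the `ridge` parameter, and the assumption that both windows are already mean-centered are all choices made here for illustration. Sliding-window maintenance is not covered.

```python
import numpy as np


def sample_columns(X, Y, c, rng):
    """Build synopsis matrices from c sampled tuples (columns).

    X is p x n and Y is q x n; column j of each holds the j-th tuple of the
    two synchronized streams. ASSUMPTION: sampling probability is proportional
    to the squared norm of the joint tuple; the source does not specify the rule.
    """
    weights = (X ** 2).sum(axis=0) + (Y ** 2).sum(axis=0)
    probs = weights / weights.sum()
    idx = rng.choice(X.shape[1], size=c, replace=True, p=probs)
    # Rescale so that Xs @ Xs.T is an unbiased estimate of X @ X.T
    # (and likewise for the cross term Xs @ Ys.T).
    scale = 1.0 / np.sqrt(c * probs[idx])
    return X[:, idx] * scale, Y[:, idx] * scale


def top_k_canonical_correlations(X, Y, k, ridge=1e-8):
    """Return the k largest canonical correlations of mean-centered X and Y.

    They are the singular values of Lx^{-1} Cxy Ly^{-T}, where Lx and Ly are
    the Cholesky factors of the (ridge-regularized) scatter matrices.
    """
    Cxx = X @ X.T + ridge * np.eye(X.shape[0])
    Cyy = Y @ Y.T + ridge * np.eye(Y.shape[0])
    Cxy = X @ Y.T
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy)      # Lx^{-1} Cxy
    M = np.linalg.solve(Ly, M.T).T    # ... times Ly^{-T}
    s = np.linalg.svd(M, compute_uv=False)
    return np.clip(s[:k], 0.0, 1.0)
```

    Under these assumptions, calling `sample_columns(X, Y, c, rng)` with `rng = np.random.default_rng()` and then passing the two synopsis matrices to `top_k_canonical_correlations` approximates the result of running the same function on the full window, at a cost that depends on the sample size c rather than the number of tuples n.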
