    Gao Tengfei, Liu Yongyan, Tang Yunbo, Zhang Lei, Chen Dan. A Massively Parallel Bayesian Approach to Factorization-Based Analysis of Big Time Series Data[J]. Journal of Computer Research and Development, 2019, 56(7): 1567-1577. DOI: 10.7544/issn1000-1239.2019.20180792

    A Massively Parallel Bayesian Approach to Factorization-Based Analysis of Big Time Series Data

    Big time series data record the evolution of one or more complex systems over large temporal and spatial scales, capturing in great detail the interactions among different parts of each system. Extracting the latent low-dimensional factors plays a crucial role in examining the overall mechanism of the underlying system. Two research challenges arise: a priori knowledge is often lacking, and most conventional factorization methods cannot adapt to the ultra-high dimensionality and scale of big data. To address these challenges, this study develops a massively parallel Bayesian approach (G-BF) to factorization-based analysis of tensors formed by massive time series. The approach relies on a Bayesian algorithm to derive the factor matrices in the absence of a priori information. The algorithm is then mapped to the compute unified device architecture (CUDA) model so that the factor matrices are updated in a massively parallel manner. The proposed approach is designed to support factorization of tensors of arbitrary dimensions. Experimental results indicate that 1) in comparison with GPU-hierarchical alternating least squares (G-HALS), G-BF exhibits much better runtime performance, and the advantage becomes more pronounced as the data scale increases; 2) G-BF scales well in terms of both data volume and rank; 3) when G-BF is applied within the existing framework for fusing sub-factors (hierarchical-parallel factor analysis, H-PARAFAC), it becomes possible to factorize a huge tensor (volume up to 10^11 over two nodes) as a whole, a capability two orders of magnitude higher than that of conventional methods.
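    To make concrete what "deriving factor matrices" of a tensor means, the following is a minimal sketch of a rank-1 factorization of a small 3-way tensor by plain alternating least squares. It is not the G-BF Bayesian algorithm or G-HALS described in the abstract; it is a swapped-in, much simpler technique, and the names `outer3` and `rank1_als` are hypothetical helpers introduced only for this illustration.

```python
def outer3(a, b, c):
    """Build the rank-1 tensor a ∘ b ∘ c as nested lists (hypothetical helper)."""
    return [[[ai * bj * ck for ck in c] for bj in b] for ai in a]


def rank1_als(X, iters=20):
    """Fit X ≈ a ∘ b ∘ c by alternating least squares (illustrative only).

    Each update is the closed-form least-squares solution for one factor
    with the other two held fixed. Assumes X is not identically zero.
    """
    I, J, K = len(X), len(X[0]), len(X[0][0])
    a, b, c = [1.0] * I, [1.0] * J, [1.0] * K
    for _ in range(iters):
        nb = sum(v * v for v in b)
        nc = sum(v * v for v in c)
        # Every entry a[i] depends only on X, b and c, so the entries
        # can be computed independently of one another.
        a = [sum(X[i][j][k] * b[j] * c[k] for j in range(J) for k in range(K)) / (nb * nc)
             for i in range(I)]
        na = sum(v * v for v in a)
        b = [sum(X[i][j][k] * a[i] * c[k] for i in range(I) for k in range(K)) / (na * nc)
             for j in range(J)]
        nb = sum(v * v for v in b)
        c = [sum(X[i][j][k] * a[i] * b[j] for i in range(I) for j in range(J)) / (na * nb)
             for k in range(K)]
    return a, b, c


# A 2 x 3 x 2 tensor that is exactly rank 1, so the fit should be near-exact.
X = outer3([1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0])
a, b, c = rank1_als(X)
R = outer3(a, b, c)
max_err = max(abs(X[i][j][k] - R[i][j][k])
              for i in range(2) for j in range(3) for k in range(2))
```

    Within each update, the entries of a factor do not depend on one another, which is the kind of independence a GPU mapping could plausibly exploit; how G-BF actually distributes its Bayesian updates across CUDA threads is not specified in the abstract.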