    Citation: Yang Yang, Wu Yu, Wang Yunhai, Cao Yi. Correlation Statistical Modeling Reduction Method for Large-Scale Structural Grid Data[J]. Journal of Computer Research and Development, 2023, 60(3): 676-689. DOI: 10.7544/issn1000-1239.202111208

    Correlation Statistical Modeling Reduction Method for Large-Scale Structural Grid Data

    • Abstract: High-confidence visual analysis of data is essential for large-scale numerical simulations, but the storage bottleneck of current high-performance computers makes it increasingly difficult for visual analysis applications to access the original high-resolution grid data. Methods based on statistical modeling can greatly reduce the storage cost of high-resolution data, but the data they reconstruct carry high uncertainty. We therefore propose a correlation-based statistical modeling reduction method for large-scale structured grid data, intended for efficient analysis and visualization of the large-scale multi-block volume data generated by massively parallel scientific simulations. The technical core of the method is to use a statistical representation of the correlation between data blocks to guide the statistical modeling of adjacent blocks. This preserves the statistical properties of the data effectively, without merging data blocks stored on different parallel computing nodes and repartitioning them according to the homogeneity requirements of the visualization. By coupling numerical distribution information, spatial distribution information, and correlation information, the method reconstructs the original data more accurately than existing methods and so reduces the uncertainty of the visualization. The experiments use five scientific data sets, the largest of which has one billion grid cells. The quantitative results show that, at the same data compression ratio, the method improves reconstruction accuracy over current state-of-the-art methods by up to nearly two orders of magnitude.
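The abstract does not specify the statistical model, so the following is only a toy sketch of the general idea, not the authors' method. It assumes a 1D array in place of a structured grid, an equal-width histogram as the per-block value distribution, and a Pearson coefficient as the correlation between adjacent blocks. All function names (`split_blocks`, `histogram`, `pearson`, `summarize`) are hypothetical. It illustrates why a value distribution alone can be insufficient: two blocks with identical histograms can still differ in spatial arrangement, and a correlation term between neighbours records that difference.

```python
from typing import List, Tuple


def split_blocks(values: List[float], block_size: int) -> List[List[float]]:
    """Cut a 1D array into consecutive blocks (stand-in for per-node data blocks)."""
    return [values[i:i + block_size] for i in range(0, len(values), block_size)]


def histogram(block: List[float], bins: int, lo: float, hi: float) -> List[int]:
    """Count samples in `bins` equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant data
    for v in block:
        idx = min(int((v - lo) / width), bins - 1)  # clamp v == hi into last bin
        counts[idx] += 1
    return counts


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation of two equally sized blocks (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def summarize(values: List[float], block_size: int,
              bins: int) -> Tuple[List[List[int]], List[float]]:
    """Return per-block histograms and correlations between adjacent blocks."""
    lo, hi = min(values), max(values)
    blocks = split_blocks(values, block_size)
    hists = [histogram(b, bins, lo, hi) for b in blocks]
    # Only compare neighbours of equal size (a trailing short block is skipped).
    corrs = [pearson(a, b) for a, b in zip(blocks, blocks[1:]) if len(a) == len(b)]
    return hists, corrs


if __name__ == "__main__":
    # A rising ramp followed by a falling ramp: same values, opposite ordering.
    data = [0, 1, 2, 3, 3, 2, 1, 0]
    hists, corrs = summarize(data, block_size=4, bins=2)
    print(hists)  # both blocks have the same histogram
    print(corrs)  # the correlation distinguishes their spatial arrangement
```

In this toy input the two blocks share one histogram, so a histogram-only summary cannot tell them apart, while the adjacent-block correlation is strongly negative. A real implementation would work on 3D blocks and would need a reconstruction step, which this sketch omits.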

       
