BeeZip2: A Domain-Specific Accelerator for High Performance Lossless Data Compression

Gao Ruihao; Shi Shunchen; Li Xueqi; Tan Guangming

doi:10.7544/issn1000-1239.202550017

Gao Ruihao, Shi Shunchen, Li Xueqi, Tan Guangming. BeeZip2: A Domain-Specific Accelerator for High Performance Lossless Data Compression[J]. Journal of Computer Research and Development, 2025, 62(6): 1562-1580. DOI: 10.7544/issn1000-1239.202550017

Abstract


High-performance and intelligent computing applications require massive amounts of data, and transferring and storing that data poses challenges for computer systems. Data compression algorithms reduce storage and transmission costs, making them crucial for improving system efficiency. Domain-specific hardware design is an effective way to accelerate data compression algorithms. Zstandard, an emerging data compression software package, significantly improves both throughput and compression ratio. It is based on the LZ77 compression algorithm but uses a larger sliding window, which increases on-chip storage overhead, and it has complex data dependencies and control flow. These features limit the benefit of hardware acceleration. To improve the throughput of a data compression accelerator with a large sliding window while achieving a similar compression ratio, we propose a cross-layer optimization approach based on algorithm-architecture co-design and use it to develop a novel data compression acceleration architecture, BeeZip2. First, we introduce the MetaHistory match method into the design of the large-sliding-window parallel hash table, which provides regular parallelism and resolves control-flow data dependencies. Then, we propose the shared match PE architecture, which distributes the large sliding window across multiple processing units so that they share on-chip memory and reduce overhead. In addition, the Lazy match strategy and its corresponding architecture help make full use of these resources to reach a higher compression ratio. Experimental results show that BeeZip2 achieves a throughput of 13.13 GB/s while maintaining the software compression ratio. Compared with single-core and 36-core CPU software implementations, BeeZip2 improves throughput by 29.2 times and 3.35 times, respectively. Compared with the baseline accelerator BeeZip, BeeZip2 achieves a 1.26 times throughput improvement and a 2.02 times improvement in throughput per area, under the constraint of maintaining a higher compression ratio than its software counterpart.
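
To make the terms in the abstract concrete, the following is a minimal software sketch of generic LZ77 matching with a hash table over a sliding window, with an optional one-step lazy match. It is not the BeeZip2 design: the MetaHistory method and the shared match PE architecture are not reproduced here, and the names `MIN_MATCH`, `find_longest_match`, `compress`, and `decompress`, the token format, and the default window size are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_MATCH = 3  # assumed minimum match length (illustrative, not from the paper)


def find_longest_match(data, pos, table, window):
    """Return (length, offset) of the longest earlier match for data[pos:]."""
    best_len, best_off = 0, 0
    if pos + MIN_MATCH > len(data):
        return best_len, best_off
    key = data[pos:pos + MIN_MATCH]
    for cand in reversed(table[key]):      # most recent candidates first
        if pos - cand > window:            # older candidates are even farther
            break
        length = 0
        while pos + length < len(data) and data[cand + length] == data[pos + length]:
            length += 1
        if length > best_len:
            best_len, best_off = length, pos - cand
    return best_len, best_off


def insert(data, pos, table):
    """Record position pos under the hash key formed by its next MIN_MATCH bytes."""
    if pos + MIN_MATCH <= len(data):
        table[data[pos:pos + MIN_MATCH]].append(pos)


def compress(data, window=1 << 20, lazy=True):
    """Encode bytes as a list of ("lit", byte) and ("match", offset, length) tokens."""
    table = defaultdict(list)
    tokens = []
    pos = 0
    while pos < len(data):
        length, offset = find_longest_match(data, pos, table, window)
        insert(data, pos, table)
        if lazy and length >= MIN_MATCH and pos + 1 < len(data):
            next_len, _ = find_longest_match(data, pos + 1, table, window)
            if next_len > length:
                # Lazy match: a longer match starts one byte later, so emit a
                # literal now and let the next iteration take the longer match.
                tokens.append(("lit", data[pos]))
                pos += 1
                continue
        if length >= MIN_MATCH:
            tokens.append(("match", offset, length))
            for p in range(pos + 1, pos + length):
                insert(data, p, table)
            pos += length
        else:
            tokens.append(("lit", data[pos]))
            pos += 1
    return tokens


def decompress(tokens):
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok[1])
        else:
            _, offset, length = tok
            for _ in range(length):
                out.append(out[-offset])   # byte-wise copy allows overlapping matches
    return bytes(out)
```

In this sketch, the hash table and the window bound grow with the sliding window, and the candidate loop and lazy check are sequential and data-dependent. These are the kinds of storage and control-flow costs the abstract refers to.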
