Abstract:
Modern blockchain systems face increasing storage pressure due to the continuous growth of on-chain data. Most systems adopt LSM-tree-based key-value storage engines as their persistence layer, which maintain intra-level key order through compaction. However, blockchain data are highly heterogeneous, with structurally diverse and unordered keys. As a result, compaction is triggered frequently, which causes write amplification and degrades I/O efficiency. To address these issues, we propose AppendChain, a storage optimization framework that reorganizes the write path and key-value layout in the storage engine without altering Ethereum’s execution semantics. AppendChain introduces three techniques: 1) Type Separation: partitioning heterogeneous key types into dedicated column families to minimize keyspace overlap; 2) Big Value Separation: redirecting large values to external blob storage to avoid redundant rewrites during compaction; 3) Sorted Merkle Patricia Trie (MPT): applying structure-aware and order-aligned key encoding to enhance the locality of state keys. We implement AppendChain on top of Go-Ethereum (Geth) and integrate it with RocksDB, modifying the storage stack without changing Ethereum’s consensus and provenance logic. Evaluations on mainnet workloads show that, compared with Geth, AppendChain improves storage engine throughput in operations per second (OPS) by up to 28 times, reduces write amplification by nearly 94.24%, and reduces compaction frequency and compaction time by 98.75% and 72.19%, respectively, significantly enhancing read/write efficiency under high-throughput transactional workloads. These results indicate that the framework effectively improves the storage performance of blockchain full nodes.
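
As a rough illustration of the first two techniques, the sketch below routes each write to a column family chosen by key type and moves large values into a separate append-only blob log, leaving only a small pointer in the LSM tree. It is a minimal in-memory model, not AppendChain's implementation: the key prefixes, the column-family names, the 1024-byte threshold, and the `SketchStore` class are all hypothetical and were chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical single-byte key prefixes and column-family names (assumptions
# for illustration; not taken from the paper or from Geth's schema).
PREFIX_TO_CF = {
    b"h": "headers",
    b"b": "bodies",
    b"r": "receipts",
    b"s": "state",
}
DEFAULT_CF = "default"
BLOB_THRESHOLD = 1024  # bytes; assumed cutoff for treating a value as "big"


@dataclass
class SketchStore:
    """In-memory stand-in for an LSM engine with column families and a blob log."""

    column_families: dict = field(default_factory=dict)
    blob_log: list = field(default_factory=list)

    def put(self, key: bytes, value: bytes) -> str:
        # Type separation: pick the column family from the key's type prefix,
        # so keys of different types never share a sorted keyspace.
        cf = PREFIX_TO_CF.get(key[:1], DEFAULT_CF)
        table = self.column_families.setdefault(cf, {})
        if len(value) >= BLOB_THRESHOLD:
            # Big value separation: append the value to the blob log and store
            # only a pointer, so compaction would rewrite the pointer alone.
            self.blob_log.append(value)
            table[key] = ("blob", len(self.blob_log) - 1)
        else:
            table[key] = ("inline", value)
        return cf

    def get(self, key: bytes):
        cf = PREFIX_TO_CF.get(key[:1], DEFAULT_CF)
        entry = self.column_families.get(cf, {}).get(key)
        if entry is None:
            return None
        kind, payload = entry
        # Follow the pointer for separated values; return inline values directly.
        return self.blob_log[payload] if kind == "blob" else payload
```

In a real deployment the column families and blob files would be provided by the storage engine itself; the dictionary and list here only stand in for that behavior so the routing logic can be read in isolation.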