Abstract:
Modern blockchain systems face increasing storage pressure as on-chain data continues to grow. Most adopt LSM-tree-based key-value storage engines as their persistence layer, which maintain key order within each level through compaction. However, blockchain data is highly heterogeneous: its keys are structurally diverse and unordered. As a result, compaction is triggered frequently, which causes write amplification and degrades I/O efficiency. To address these issues, we propose AppendChain, a storage optimization framework that reorganizes the write path and key-value layout of the storage engine without altering Ethereum’s execution semantics. AppendChain introduces three techniques: 1) Type Separation, which partitions heterogeneous key types into dedicated column families to minimize keyspace overlap; 2) Big Value Separation, which redirects large values to external blob storage to avoid redundant rewrites during compaction; and 3) Sorted Merkle Patricia Trie (MPT), which applies structure-aware, order-aligned key encoding to improve the locality of state keys. We implement AppendChain on top of Go-Ethereum (Geth) and integrate it with RocksDB, modifying the storage stack without changing Ethereum’s consensus and provenance logic. Evaluations on mainnet workloads show that, compared with Geth, AppendChain improves storage engine throughput (operations per second, OPS) by up to 28×, reduces write amplification by nearly 94.24%, and reduces compaction frequency and compaction time by 98.75% and 72.19%, respectively. These gains significantly improve read/write efficiency under high-throughput transactional workloads and, in turn, the storage performance of blockchain full nodes.
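
To make the first two techniques concrete, the following toy sketch routes each key to a per-type family and moves large values out of line. It is an illustration only and not AppendChain's implementation: the one-byte prefixes, the family names, the 1024-byte threshold, and the `ToyStore` class are all assumptions introduced here, and in-memory dictionaries stand in for RocksDB column families and blob storage.

```python
from dataclasses import dataclass, field

# Hypothetical key-type prefixes and family names (assumptions for illustration;
# the abstract does not specify the actual mapping).
PREFIX_TO_FAMILY = {b"h": "headers", b"b": "bodies", b"r": "receipts"}
DEFAULT_FAMILY = "default"

# Hypothetical size threshold, in bytes, above which a value counts as "big".
BIG_VALUE_THRESHOLD = 1024


@dataclass
class ToyStore:
    # family name -> {key: ("inline", value) or ("blob", index into blobs)}
    families: dict = field(default_factory=dict)
    # Append-only list standing in for external blob storage.
    blobs: list = field(default_factory=list)

    def _family_for(self, key: bytes) -> str:
        # Type separation: pick the family from the key's first byte.
        return PREFIX_TO_FAMILY.get(key[:1], DEFAULT_FAMILY)

    def put(self, key: bytes, value: bytes) -> str:
        family = self._family_for(key)
        cf = self.families.setdefault(family, {})
        if len(value) >= BIG_VALUE_THRESHOLD:
            # Big value separation: store only a small pointer in the family,
            # so reorganizing the family never rewrites the large payload.
            self.blobs.append(value)
            cf[key] = ("blob", len(self.blobs) - 1)
        else:
            cf[key] = ("inline", value)
        return family

    def get(self, key: bytes):
        entry = self.families.get(self._family_for(key), {}).get(key)
        if entry is None:
            return None
        kind, payload = entry
        return self.blobs[payload] if kind == "blob" else payload
```

In this sketch, keys of different types never share a family, and a large block body is written once to the blob list while its family holds only a pointer to it.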