Abstract:
SSDs are popular in large-scale storage systems because they accelerate system performance, but a single SSD cannot satisfy the performance, capacity, and reliability requirements of data-intensive computing applications. Applying RAID algorithms to SSDs is therefore a necessary and promising way to build high-performance, high-capacity, and highly reliable SSD-based storage systems. However, garbage collection (GC) operations significantly degrade SSD performance, which leads to performance variability in a redundant array of independent SSDs (RAIS). To address this problem, GC-RAIS exploits the high random-read performance of SSDs and the hot-spare SSD in RAIS to alleviate the negative impact of GC operations on RAIS performance. While an SSD is in the GC state, incoming read requests to that SSD are serviced by reconstructing the requested data from the other SSDs in the same stripe (read reconstruction), and incoming write data is temporarily stored on the hot-spare SSD while the corresponding parity is updated concurrently (write redirection). After the GC process completes, the redirected write data is written back to the SSD where it belongs. DiskSim and the MSR SSD simulator are extended to implement GC-RAIS, and its performance is evaluated with realistic HPC-like and enterprise workloads. The simulation results show that GC-RAIS outperforms local garbage collection (LGC) and global garbage collection (GGC) by 55% and 25% on average, respectively. Moreover, GC-RAIS reduces performance variability across a wide variety of HPC-like and enterprise workloads.