Abstract:
As data-intensive applications proliferate, cluster file systems need to manage PB-scale or even EB-scale storage. Limited by how data location information is managed, object storage servers face scalability problems in data location, load balancing, and replica maintenance. To address these problems, we present a scalable storage space management method that supports EB-scale storage. First, the method organizes object location information into a two-level index structure using extendible hashing, which allows object location information to be managed scalably. Second, the method places objects according to the distribution of the object location structures, so the system can adjust the distribution of data by adjusting the distribution of these structures with little overhead. Third, the method records replica locations at the granularity of the object location structure to reduce maintenance overhead. The evaluation shows that the storage space management provides highly efficient data management for massive storage. With the load-balancing mechanism, the I/O throughput of the system can increase by 10%. Under concurrent workloads, the system can achieve more scalable performance than Lustre and DCFS3.
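The abstract does not give implementation details, so the following is only a minimal sketch of how a two-level index built with extendible hashing might look. It assumes a first-level directory indexed by the low bits of an object ID hash and second-level buckets that hold object locations. The names `Bucket`, `LocationIndex`, `move_bucket`, the `server` and `replicas` fields, and the bucket capacity are all hypothetical and are not taken from the paper.

```python
import hashlib


class Bucket:
    """Second-level structure: object locations for one hash range."""

    def __init__(self, local_depth, server):
        self.local_depth = local_depth
        self.server = server    # hypothetical: server holding this bucket's objects
        self.replicas = []      # hypothetical: replica servers, recorded per bucket
        self.entries = {}       # object id -> location on that server


class LocationIndex:
    """First-level directory with 2**global_depth slots pointing at buckets."""

    def __init__(self, capacity=4, server="oss0"):
        self.capacity = capacity            # max entries per bucket (assumed)
        self.global_depth = 0
        self.directory = [Bucket(0, server)]

    @staticmethod
    def _hash(object_id):
        digest = hashlib.sha1(object_id.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def _bucket(self, object_id):
        # The low global_depth bits of the hash select the directory slot.
        slot = self._hash(object_id) & ((1 << self.global_depth) - 1)
        return self.directory[slot]

    def lookup(self, object_id):
        bucket = self._bucket(object_id)
        if object_id not in bucket.entries:
            return None
        return bucket.server, bucket.entries[object_id]

    def insert(self, object_id, location):
        while True:
            bucket = self._bucket(object_id)
            if object_id in bucket.entries or len(bucket.entries) < self.capacity:
                bucket.entries[object_id] = location
                return
            self._split(bucket)  # bucket is full: split it and retry

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Double the directory; each new slot aliases its old counterpart.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bit = 1 << bucket.local_depth
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth, bucket.server)
        sibling.replicas = list(bucket.replicas)
        old_entries, bucket.entries = bucket.entries, {}
        for oid, loc in old_entries.items():
            target = sibling if self._hash(oid) & bit else bucket
            target.entries[oid] = loc
        # Repoint the directory slots whose newly significant bit is set.
        for i, b in enumerate(self.directory):
            if b is bucket and i & bit:
                self.directory[i] = sibling

    def move_bucket(self, bucket, server):
        # Rebalancing changes one field per bucket, not one per object.
        bucket.server = server
```

In this sketch, reassigning a bucket with `move_bucket` changes the server reported by `lookup` for every object in that bucket at once, which illustrates why placement and replica records kept at bucket granularity can be cheaper to maintain than per-object records. Actual data migration between servers is outside the scope of the sketch.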