    Data-Pattern-Aware Low-Cost Cloud Log Storage Systems

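    • Abstract: Log services on public clouds can comprehensively improve capabilities in development, operations and maintenance, business operations, and security. Cloud logs are characterized by massive data volume, long retention periods, high write rates, a low density of useful information, and strict access-latency requirements. To reduce storage cost, three requirements must be met: 1) store the data at a high compression density (compress hard); 2) sustain a high compression speed for data ingestion (compress fast); 3) retrieve compressed data with low latency (query fast). Achieving all three at once is challenging and requires a design customized to the specific application scenario. By summarizing the typical data patterns in cloud logs, this paper presents a low-cost cloud log storage paradigm, the data-pattern-aware low-cost cloud log storage system, and compares several low-cost cloud log storage methods in terms of compression ratio, compression speed, and retrieval latency. Finally, drawing on research in related fields, three lessons and reflections are offered as a reference for future research.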

       

      Abstract: Cloud-native log services can substantially improve the development, maintenance, operation, and security capabilities of the public cloud. Cloud log data are typically large in scale and must be retained for long periods, ingested at high speed, and accessed with low latency, yet their information density is low. To reduce storage cost, the logs must be compressed compactly, compressed at high speed, and queried with low latency. Achieving these three goals at the same time is challenging and calls for a solution customized to this scenario. By summarizing the typical data patterns in cloud logs, namely static patterns (the formatted output statements in the source code) and runtime patterns (generated during program execution), a low-cost storage paradigm is proposed for public cloud logs. Several low-cost cloud log storage methods are evaluated and compared in terms of compression ratio, compression speed, and query latency. Finally, several lessons for designing a low-cost cloud log storage system are offered in the hope of informing future research.
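
      The split between static patterns and runtime values can be illustrated with a minimal Python sketch. It is not the method evaluated in the paper. It assumes that numeric tokens (integers and dotted numbers such as IP addresses) are the only runtime values, and that the placeholder `<*>` never occurs in a raw log line. The names `split_line`, `encode`, and `decode` are hypothetical.

```python
import re

# Assumption: runtime values are integers or dotted numbers (e.g. IP addresses).
VARIABLE = re.compile(r"\d+(?:\.\d+)*")
PLACEHOLDER = "<*>"  # assumed never to appear in a raw log line


def split_line(line):
    """Split one log line into its static template and its runtime values."""
    variables = VARIABLE.findall(line)
    template = VARIABLE.sub(PLACEHOLDER, line)
    return template, variables


def encode(lines):
    """Store each distinct template once; keep only (template id, values) per line."""
    templates = {}  # template text -> integer id
    rows = []
    for line in lines:
        template, variables = split_line(line)
        template_id = templates.setdefault(template, len(templates))
        rows.append((template_id, variables))
    return templates, rows


def decode(templates, rows):
    """Rebuild the original lines by filling each template with its values."""
    by_id = {template_id: text for text, template_id in templates.items()}
    lines = []
    for template_id, variables in rows:
        parts = by_id[template_id].split(PLACEHOLDER)
        line = parts[0]
        for value, rest in zip(variables, parts[1:]):
            line += value + rest
        lines.append(line)
    return lines


if __name__ == "__main__":
    sample = [
        "Connected to 10.0.0.1 in 35 ms",
        "Connected to 10.0.0.2 in 7 ms",
        "Disk usage at 91 percent",
    ]
    templates, rows = encode(sample)
    assert decode(templates, rows) == sample
```

      In this sketch the repeated static text is stored once per template, and a query for one kind of event can filter rows by template id before touching any values. A real system would also need a general-purpose compressor for the values and a more robust template extractor.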

       
