高级检索

    一种时序数据模式演化的跟踪与查询方法

    Tracking and Querying over Timeseries Data with Schema Evolution

    • 摘要: 在物联网与大数据应用蓬勃发展的背景下,各类感知设备产生海量的时序数据,设备管理软件版本的快速迭代导致时序数据的模式演化问题日益凸显.模式演化要求对数据模式进行版本管理,使数据进行模式变更时不产生信息损失,且支持对数据跨模式版本进行读写操作.结合流行的时序数据库管理系统,调研总结了各类数据库管理系统对模式演化的支持情况,对时序数据及其模式进行了形式化表述,对其模式演化的过程进行了分析,设计了一种面向时序数据的模式演化跟踪及查询方法,形式化表达了模式跟踪及跨模式版本查询的整体框架与关键步骤,并在时序数据库Apache IoTDB上进行了实现与测试.最后,分析了实现系统的性能,并展望了未来研究方向.

       

      Abstract: In the context of the Internet of things and big data, vast amount of sensors generate massive time series data on daily basis. The fast iterations of software releases lead to frequent changes to the schema of these time series, which makes the management of schema evolution of time series increasingly prominent. Schema evolution requires the management of each version of data schema, so that there is no information loss during schema modification, and data can be accessed across multiple schema versions. Existing timeseries databases management system have limited support for schema evolution, while schema evolution may occur frequently under this circumstance. State-of-art research and technology for schema evolution mainly focus on relational database, struggling with complicated integrity constraint which is more flexible within timeseries database. This paper compares various databases with regard to schema evolution, provide a formal definition to the time series and its schemas, and analyzes the process of schema evolution. This paper designs a data-centric schema evolution tracing and querying system, discusses the key problems of schema tracking and cross schema version query in detail, and implements and tests it on the timeseries database Apache IoTDB. Finally, the performance of the system is evaluated, and the future research is discussed.

       

    /

    返回文章
    返回