Abstract:
With the continued development of the Industrial Internet of Things (IIoT), a growing number of devices and sensors are being connected to networks, producing a massive influx of time series data. This explosive growth poses new challenges for database management systems: continuous high-throughput data ingestion, low-latency multidimensional queries, high-performance time series indexing, and cost-effective data storage. In recent years, time series database (TSDB) technology has become an active research topic in the database field. Researchers have studied TSDB technology in depth, and specialized TSDBs have emerged for managing time series data and have been applied in various fields, becoming essential components of the IIoT. Existing reviews of TSDBs focus primarily on comparing functionality and performance and on providing selection recommendations for specific domains. They lack analysis of key technologies such as data persistence, querying, computation, and indexing in time series stores. Moreover, these reviews were published some time ago and do not cover modern TSDB technologies. We conduct a comprehensive survey and analysis of both academic research on time series data storage and industrial TSDBs. We examine in depth four key technologies in TSDBs: 1) time series index optimization techniques; 2) in-memory data organization techniques; 3) high-throughput data ingestion and low-latency data query techniques; 4) cost-effective storage techniques for massive historical data. We also analyze and summarize existing TSDB benchmarks. Finally, we outline future development directions for the key technologies in TSDBs.