Apache IoTDB: A Time Series Database for IoT Applications
CHEN WANG, Tsinghua University, China
JIALIN QIAO and XIANGDONG HUANG, Timecho Ltd, China
SHAOXU SONG, Tsinghua University, China
HAONAN HOU, Timecho Ltd, China
TIAN JIANG, LEI RUI, JIANMIN WANG, and JIAGUANG SUN, Tsinghua University, China
A typical industrial scenario encounters thousands of devices with millions of sensors, consistently generating
billions of data points. It poses new requirements of time series data management, not well addressed in
existing solutions, including (1) device-defined ever-evolving schema, (2) mostly periodical data collection,
(3) strongly correlated series, (4) variously delayed data arrival, and (5) highly concurrent data ingestion. In
this paper, we present a time series database management system, Apache IoTDB. It consists of (i) a time
series native file format, TsFile, with specially designed data encoding, and (ii) an IoTDB engine for efficiently
handling delayed data arrivals and processing queries. The system achieves a throughput of 10 million inserted
values per second. Queries such as 1-day data selection of 0.1 million points and 3-year data aggregation over
10 million points can be processed in 100 ms. Comparisons with InfluxDB, TimescaleDB, KairosDB, Parquet
and ORC over real world data loads demonstrate the superiority of IoTDB and TsFile.
CCS Concepts: • Information systems → Data management systems.
Additional Key Words and Phrases: time series, data model, database engine, distributed
ACM Reference Format:
Chen Wang, Jialin Qiao, Xiangdong Huang, Shaoxu Song, Haonan Hou, Tian Jiang, Lei Rui, Jianmin Wang,
and Jiaguang Sun. 2023. Apache IoTDB: A Time Series Database for IoT Applications. Proc. ACM Manag. Data
1, 2, Article 195 (June 2023), 27 pages. https://doi.org/10.1145/3589775
1 INTRODUCTION
In the Internet of Things (IoT), a huge amount of time series data is generated by various devices with
many sensors attached. The data need to be managed not only in the cloud for intelligent analysis
but also at the edge for real-time control. For example, more than 20,000 excavators are managed
by one of our industrial partners, a maintenance service provider of heavy industry machines, each
of which has hundreds of sensors, e.g., monitoring engine rotation speed. As illustrated in Figure 1,
the data are rst packed in devices, and then sent to the server via 5G mobile network. In the server,
the data are written to a time series database for OLTP queries. Finally, data scientists may load
data from the database to a big data platform for complex analysis and forecasting, i.e., OLAP tasks.
Shaoxu Song (https://sxsong.github.io/) is the corresponding author.
Authors’ addresses: Chen Wang, wang_chen@tsinghua.edu.cn, Tsinghua University, Beijing, China; Jialin Qiao,
jialin.qiao@timecho.com; Xiangdong Huang, hxd@timecho.com, Timecho Ltd, Beijing, China; Shaoxu Song,
sxsong@tsinghua.edu.cn, Tsinghua University, Beijing, China; Haonan Hou, haonan.hou@timecho.com, Timecho Ltd,
Beijing, China; Tian Jiang, jiangtia18@mails.tsinghua.edu.cn; Lei Rui, rl18@mails.tsinghua.edu.cn; Jianmin Wang,
jimwang@tsinghua.edu.cn; Jiaguang Sun, sunjg@tsinghua.edu.cn, Tsinghua University, Beijing, China.
This work is licensed under a Creative Commons Attribution International 4.0 License.
© 2023 Copyright held by the owner/author(s).
2836-6573/2023/6-ART195
https://doi.org/10.1145/3589775
Proc. ACM Manag. Data, Vol. 1, No. 2, Article 195. Publication date: June 2023.
Fig. 1. Data management in IoT scenarios
Fig. 2. Time series management issues in IoT scenarios
1.1 Motivation
The process in Figure 1 poses new requirements to time series database management systems. (1)
In the end device, such as the aforesaid excavator, a lightweight database or a compact file format
is needed to save space and network bandwidth. (2) In the edge server, a full-function database
collects, stores and queries the massive data of devices, capable of handling delayed arrivals. (3)
In the cloud, a database cluster with complete historical data persistence connects directly to big
data analysis systems, such as Spark and Hadoop, and enables OLAP queries. In addition to the
large scale issues, millions of series (columns) and billions of points (rows), we highlight below the
unique and urgent features in the IoT scenarios.
1.1.1 Device-defined Ever-evolving Schema. Unlike the traditional databases with pre-defined
schema, the schema of time series data in the IoT scenario is defined by sensors in the devices.
During the device maintenance or upgrade, sensors are frequently removed, replaced or augmented,
leading to changed schema. For instance, as illustrated in Figure 2, sensor FC32 for monitoring
fuel consumption is replaced by FC3X, at time 09:06:13. We need a data model that is sufficiently
flexible to capture such ever-evolving schema.
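To make the requirement concrete, the following is a minimal sketch of a store whose schema is defined by whatever sensors actually report, assuming a plain in-memory mapping. The class `DeviceStore`, the device name `excavator_1`, and the readings are hypothetical and chosen only for illustration; this is not the IoTDB data model.

```python
from collections import defaultdict


class DeviceStore:
    """Toy store: each device maps sensor names to lists of (timestamp, value)."""

    def __init__(self):
        # Series are created lazily on first write; no schema is declared up front.
        self._series = defaultdict(lambda: defaultdict(list))

    def write(self, device, sensor, timestamp, value):
        self._series[device][sensor].append((timestamp, value))

    def sensors(self, device):
        # The "schema" of a device is simply the set of sensors seen so far.
        return sorted(self._series[device])


store = DeviceStore()
store.write("excavator_1", "FC32", "09:05:13", 31.2)  # hypothetical reading
store.write("excavator_1", "FC3X", "09:06:13", 30.8)  # replacement sensor appears
print(store.sensors("excavator_1"))
```

Here the replacement sensor FC3X becomes part of the device schema on its first write, while the history of FC32 remains queryable.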
1.1.2 Mostly Periodical Data Collection. Machine generated sensor data are often collected
periodically with a pre-set frequency. While the time series is expected to have a regular time interval, there
may be small variations due to data bus congestion or network delay. Even worse, values that are
unchanged from the previous reading may be omitted to save energy. For example, in Figure 2, ES05 is mostly
collected every 60 seconds, but with a small delay from time 09:04:13 to 09:04:20 and an omitted
data point at time 09:07:13. Data encoding should be able to handle such variations for efficient storage.
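The ES05 example can be worked through numerically. The sketch below assumes a 60-second period and a hypothetical list of arrival times built around the two irregularities mentioned in the text (the delay to 09:04:20 and the missing point at 09:07:13); the remaining timestamps are assumptions. It stores each interval as its deviation from the period, which is one simple way to exploit regularity and is not claimed to be the encoding used by TsFile.

```python
def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s


def period_residuals(timestamps, period):
    """Return each consecutive interval minus the expected period."""
    return [b - a - period for a, b in zip(timestamps, timestamps[1:])]


# Hypothetical arrival times: 09:04:13 arrives late at 09:04:20,
# and the point expected at 09:07:13 is omitted.
stamps = ["09:01:13", "09:02:13", "09:03:13", "09:04:20",
          "09:05:13", "09:06:13", "09:08:13"]
ts = [to_seconds(s) for s in stamps]
residuals = period_residuals(ts, period=60)
print(residuals)  # [0, 0, 7, -7, 0, 60]
```

Most residuals are zero, the delay shows up as a small pair (+7, -7), and the omitted point appears as one residual equal to a full period, so the sequence is far more compressible than the raw timestamps.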
1.1.3 Strongly Correlated Series. It is also worth noting that multiple sensors, e.g., in the same
module of a device, may collect data at the same time. In addition to the same timestamps, their
values may also be correlated. For instance, in Figure 2, the fuel consumption (FC32/FC3X) value