
Apache IoTDB: A Time Series Database for IoT Applications
CHEN WANG, Tsinghua University, China
JIALIN QIAO and XIANGDONG HUANG, Timecho Ltd, China
SHAOXU SONG, Tsinghua University, China
HAONAN HOU, Timecho Ltd, China
TIAN JIANG, LEI RUI, JIANMIN WANG, and JIAGUANG SUN, Tsinghua University, China
A typical industrial scenario involves thousands of devices with millions of sensors, continuously generating
billions of data points. This poses new requirements for time series data management that are not well addressed by
existing solutions, including (1) device-defined ever-evolving schema, (2) mostly periodical data collection,
(3) strongly correlated series, (4) variously delayed data arrival, and (5) highly concurrent data ingestion. In
this paper, we present a time series database management system, Apache IoTDB. It consists of (i) a time
series native file format, TsFile, with specially designed data encoding, and (ii) an IoTDB engine for efficiently
handling delayed data arrivals and processing queries. The system achieves a throughput of 10 million inserted
values per second. Queries such as 1-day data selection of 0.1 million points and 3-year data aggregation over
10 million points can be processed in 100 ms. Comparisons with InfluxDB, TimescaleDB, KairosDB, Parquet
and ORC over real-world data loads demonstrate the superiority of IoTDB and TsFile.
CCS Concepts: • Information systems → Data management systems.
Additional Key Words and Phrases: time series, data model, database engine, distributed
ACM Reference Format:
Chen Wang, Jialin Qiao, Xiangdong Huang, Shaoxu Song, Haonan Hou, Tian Jiang, Lei Rui, Jianmin Wang,
and Jiaguang Sun. 2023. Apache IoTDB: A Time Series Database for IoT Applications. Proc. ACM Manag. Data
1, 2, Article 195 (June 2023), 27 pages. https://doi.org/10.1145/3589775
1 INTRODUCTION
In the Internet of Things (IoT), a huge amount of time series data is generated by various devices with
many sensors attached. The data need to be managed not only in the cloud for intelligent analysis
but also at the edge for real-time control. For example, more than 20,000 excavators are managed
by one of our industrial partners, a maintenance service provider of heavy industry machines, each
of which has hundreds of sensors, e.g., monitoring engine rotation speed. As illustrated in Figure 1,
the data are first packed in devices, and then sent to the server via a 5G mobile network. On the server,
the data are written to a time series database for OLTP queries. Finally, data scientists may load
data from the database to a big data platform for complex analysis and forecasting, i.e., OLAP tasks.
Shaoxu Song (https://sxsong.github.io/) is the corresponding author.
Authors’ addresses: Chen Wang, wang_chen@tsinghua.edu.cn, Tsinghua University, Beijing, China; Jialin Qiao,
jialin.qiao@timecho.com; Xiangdong Huang, hxd@timecho.com, Timecho Ltd, Beijing, China; Shaoxu Song, sxsong@tsinghua.edu.cn,
Tsinghua University, Beijing, China; Haonan Hou, haonan.hou@timecho.com, Timecho Ltd, Beijing, China; Tian Jiang,
jiangtia18@mails.tsinghua.edu.cn; Lei Rui, rl18@mails.tsinghua.edu.cn; Jianmin Wang, jimwang@tsinghua.edu.cn; Jiaguang
Sun, sunjg@tsinghua.edu.cn, Tsinghua University, Beijing, China.
This work is licensed under a Creative Commons Attribution International 4.0 License.
© 2023 Copyright held by the owner/author(s).
2836-6573/2023/6-ART195
https://doi.org/10.1145/3589775
Proc. ACM Manag. Data, Vol. 1, No. 2, Article 195. Publication date: June 2023.