Temporal-Frequency Masked Autoencoders for
Time Series Anomaly Detection
Yuchen Fang¹, Jiandong Xie², Yan Zhao³, Lu Chen⁴, Yunjun Gao⁴, Kai Zheng¹
¹ University of Electronic Science and Technology of China, China
² Huawei Cloud Database Innovation Lab, China
³ Aalborg University, Denmark
⁴ Zhejiang University, China
fyclmiss@gmail.com, xiejiandong@huawei.com, yanz@cs.aau.dk, {luchen,gaoyj}@zju.edu.cn, zhengkai@uestc.edu.cn
Abstract—In the era of observability, massive amounts of time
series data have been collected to monitor the running status
of the target system, where anomaly detection serves to identify
observations that differ significantly from the remaining ones and
is of utmost importance to enable value extraction from such
data. While existing reconstruction-based methods have demonstrated favorable
detection capabilities in the absence of labeled data, they still encounter issues of
training bias on abnormal times and distribution shifts within time series. To address
these issues, we propose a simple yet effective Temporal-Frequency Masked AutoEncoder
(TFMAE) to detect anomalies in time series through a contrastive criterion.
Specifically, TFMAE uses two Transformer-based autoencoders that respectively
incorporate a window-based temporal masking strategy and an amplitude-based frequency
masking strategy to learn knowledge without abnormal bias and reconstruct anomalies
by the extracted normal information. Moreover, the dual autoencoder undergoes training
through a contrastive objective function, which minimizes the discrepancy of
representations from temporal-frequency masked autoencoders to highlight anomalies,
as it helps alleviate the negative impact of distribution shifts. Finally, to prevent
over-fitting, TFMAE adopts adversarial training during the training phase. Extensive
experiments conducted on seven datasets provide evidence that our model is able to
surpass the state-of-the-art in terms of anomaly detection accuracy.
Index Terms—time series anomaly detection, temporal-
frequency analysis, masked autoencoder
I. INTRODUCTION
A time series is a sequence of temporally ordered observations, and its analysis has
swiftly become a focal task in academic and industrial research, propelled by advances
in our capability of time series data collection and storage in the context of sensor
networks, cloud computing, and especially the recently emerging concept of
observability [1], [2]. Anomaly detection is an indispensable task in time series
analysis: it determines whether the data conforms to the normal data distribution, and
the non-conforming parts are called anomalies. Timely alerts for anomalies can empower
system maintainers to proactively conduct maintenance, enabling sustainability and
safety in real applications such as fraud detection [3], intrusion detection [4], and
energy management [5].
Corresponding authors: Kai Zheng and Yan Zhao. Kai Zheng is with the
Shenzhen Institute for Advanced Study, University of Electronic Science and
Technology of China, Shenzhen, China.
However, providing accurate time series anomaly detection is a non-trivial task
because patterns of time series are intricate and dynamic in various applications,
which makes it hard to find a general way of defining anomalies accurately. Moreover,
the scarcity of labeled data impedes the progress of supervised methods for time
series anomaly detection. Unsupervised approaches can detect anomalies without labeled
data, often relying on density-based methods (leveraging discrepancies between
neighbors) and clustering-based methods (utilizing distances from cluster centers). As
the scale of time series data grows and deep learning excels in data analysis, recent
endeavors turn to reconstructing time series with intricate temporal correlations and
multivariate dependencies by using various deep learning models. These models focus on
the discrepancy between the reconstructed and original time series. Innovations like
OmniAno [6], TimesNet [7], and TranAD [8] introduce the recurrent neural network,
convolutional neural network, and Transformer network into time series anomaly
detection, respectively.
Despite recent improvements, existing deep reconstruction-
based methods still face the following challenges.
Challenge I: Abnormal bias. Deep learning models heavily rely on the knowledge learned
from data during the training phase, yet extracting information from time series proves
more challenging than from language and image data due to its intricate patterns [9],
[10]. Therefore, learning a high-quality reconstruction model becomes particularly
hard, especially in the presence of knowledge-agnostic abnormal bias. As depicted in
the left of Figure 1, TimesNet [7], a typical reconstruction model, reconstructs normal
series well, yet overfits abnormal observations, leading to performance degradation.
This phenomenon arises from the incorporation of misleading abnormal bias, which can
blend into normal patterns through the commonly used temporal modeling. Unfortunately,
many existing reconstruction-based methods overlook this abnormal bias, resulting in
suboptimal performance.
2024 IEEE 40th International Conference on Data Engineering (ICDE)
2375-026X/24/$31.00 ©2024 IEEE
DOI 10.1109/ICDE60146.2024.00099
[Figure 1 panels. Left: Value versus Time, with Input, Reconstructed, and Score curves. Right: Density versus Anomaly Score, with Test, Val, and Threshold curves.]
Fig. 1: Left: TimesNet [7] conducts anomaly detection on the synthetic NIPS-TS-Global
dataset. Right: Cumulative distribution function (CDF) of the anomaly scores on the
real-world SMAP validation and test sets for TimesNet.
Challenge II: Time series distribution shift. Distribution shift is a crucial factor in
time series: patterns learned from training data may become unsuitable or even
incorrect for testing data, which results in erroneous reconstructed time series. As
shown in the right of Figure 1, the curve of the cumulative score on the testing data
goes up faster than that on the validation data, which is attributed to the
distribution shift of time series and leads to poor generalization of the threshold.
Two types of existing techniques can be incorporated into reconstruction-based methods
to mitigate distribution shifts: normalization and decomposition. However, static
statistics-based normalization [11], [12] and pre-defined parameter-based
decomposition [9], [13]–[15] exhibit limitations in generalization. Moreover, methods
relying on learned statistics [16] and parameters [17], [18] remain susceptible to the
influence of distribution shifts.
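The threshold-generalization problem described for Challenge II can be illustrated with a small sketch. It assumes a common convention in which the threshold is the empirical quantile of validation scores for a chosen anomaly ratio; the function names, the Gaussian score distributions, and the direction and size of the shift are all hypothetical and are not taken from the paper.

```python
import random

def quantile_threshold(scores, anomaly_ratio):
    """Return a score such that roughly `anomaly_ratio` of `scores` lie above it."""
    ordered = sorted(scores)
    # Position of the (1 - anomaly_ratio) empirical quantile, clamped to a valid index.
    idx = min(len(ordered) - 1, int((1.0 - anomaly_ratio) * len(ordered)))
    return ordered[idx]

def flagged_fraction(scores, threshold):
    """Fraction of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)

rng = random.Random(0)
# Hypothetical anomaly scores. The test scores are assumed to sit lower than the
# validation scores, which is one way a test CDF could rise faster.
val_scores = [rng.gauss(0.30, 0.05) for _ in range(2000)]
test_scores = [rng.gauss(0.25, 0.05) for _ in range(2000)]

threshold = quantile_threshold(val_scores, anomaly_ratio=0.01)
val_rate = flagged_fraction(val_scores, threshold)
test_rate = flagged_fraction(test_scores, threshold)
print(f"threshold={threshold:.3f} val_rate={val_rate:.4f} test_rate={test_rate:.4f}")
```

Under these assumptions the threshold flags about 1% of the validation scores but a much smaller share of the shifted test scores, so a threshold tuned on one distribution does not carry the intended anomaly ratio over to the other.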
This work aims to address the above challenges by designing a pioneering model, called
Temporal-Frequency Masked AutoEncoder (TFMAE), which is trained through an adversarial
contrastive objective function. For abnormal bias, recognizing the importance of
purifying time series, we implement a window-based temporal masking strategy to
eliminate potential observation anomalies (e.g., global and contextual observation
anomalies) with a large coefficient of variation, and an amplitude-based frequency
masking strategy to eliminate potential pattern anomalies (e.g., trend and seasonal
anomalies) with a small amplitude. These strategies are beneficial for learning without
abnormal bias and reconstructing anomalies with the learned normal knowledge. To tackle
the distribution shift issue, we first devise two Transformer-based autoencoders to
generate distinct representations of our temporal and frequency masking-based time
series. Then a novel contrastive objective function is introduced to minimize the
discrepancy between these representations. Finally, the contrastive discrepancy
replaces the conventional reconstruction error for anomaly detection, as the
discrepancy between anomalies and their corresponding normal-recovered views exceeds
that of normal representations. Breaking free from the reconstruction paradigm, the
contrastive criterion leverages the fact that the similarity of different views is
distribution-agnostic [19]. Besides, to avoid over-fitting in minimizing discrepancy,
adversarial training is integrated into our TFMAE.
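The two masking criteria named above can be sketched on a univariate series. This is only a minimal illustration under stated assumptions: the trailing-window coefficient of variation, the fixed masking ratios, the naive DFT, and the helper names `window_cv`, `temporal_mask`, `dft`, and `frequency_mask` are hypothetical choices, and the sketch only selects positions to mask without saying what replaces them.

```python
import cmath
import math

def window_cv(series, window):
    """Coefficient of variation (std / |mean|) of the trailing window ending at each step."""
    cvs = []
    for t in range(len(series)):
        chunk = series[max(0, t - window + 1): t + 1]
        mean = sum(chunk) / len(chunk)
        var = sum((x - mean) ** 2 for x in chunk) / len(chunk)
        cvs.append(math.sqrt(var) / (abs(mean) + 1e-8))  # epsilon avoids division by zero
    return cvs

def temporal_mask(series, window, ratio):
    """Indices of the `ratio` fraction of time steps with the largest windowed CV."""
    cvs = window_cv(series, window)
    k = max(1, int(ratio * len(series)))
    ranked = sorted(range(len(series)), key=lambda t: cvs[t], reverse=True)
    return sorted(ranked[:k])

def dft(series):
    """Naive discrete Fourier transform, adequate for short illustrative series."""
    n = len(series)
    return [sum(x * cmath.exp(-2j * math.pi * f * t / n) for t, x in enumerate(series))
            for f in range(n)]

def frequency_mask(series, ratio):
    """Indices of the `ratio` fraction of frequency bins with the smallest amplitude."""
    amps = [abs(c) for c in dft(series)]
    k = max(1, int(ratio * len(series)))
    ranked = sorted(range(len(series)), key=lambda f: amps[f])
    return sorted(ranked[:k])

# Hypothetical data: an offset sine wave with period 16 and one injected spike at t = 40.
series = [5.0 + math.sin(2 * math.pi * t / 16) for t in range(64)]
series[40] += 6.0

t_mask = temporal_mask(series, window=4, ratio=0.1)
f_mask = frequency_mask(series, ratio=0.5)
print("masked time steps:", t_mask)
print("number of masked frequency bins:", len(f_mask))
```

In this toy series the windows that contain the spike have the largest coefficient of variation, so the spike position falls inside the temporal mask, while the dominant bins (the constant offset and the period-16 component) have large amplitude and stay outside the frequency mask.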
Our contributions can be summarized as follows:
• To eliminate potential abnormal observations and patterns before modeling, we present
  a window-based temporal masking strategy and an amplitude-based frequency masking
  strategy. Therefore, autoencoders with purified inputs are not misled by observation
  and pattern anomalies, i.e., TFMAE is an abnormal bias-resistant model.
• To the best of our knowledge, this work is the first study that replaces the
  reconstruction error with the temporal and frequency masking-based contrastive
  criterion for time series anomaly detection, yielding a model that is unaffected by
  distribution shift.
• We conduct extensive experiments on seven public real-world time series datasets, and
  the results offer insight into the effectiveness and efficiency of TFMAE.
Section II surveys the related work. The problem statement
and the system overview are introduced in Section III. We then
present TFMAE in Section IV, followed by the experimental
results in Section V. Section VI concludes this paper.
II. RELATED WORK
A. Time Series Anomaly Detection
Numerous studies have been conducted on time series anomaly detection. Based on how
they detect anomalies, we can divide the methods into five types: density-based
methods, clustering-based methods, label-based methods, reconstruction-based methods,
and contrastive-based methods.
Density-based methods mainly focus on the discrepancy between observations and their
neighbors. The density-based local outlier factor (LOF) [20] and connectivity-based
connectivity outlier factor (COF) [21] are two typical traditional density-based
methods. As deep representation learning gained traction, advanced models such as
DAGMM [22] and MPPCACD [23] learn low-dimensional representations of time series and
utilize the Gaussian Mixture Model to derive their density, which achieves highly
accurate detection results.
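The neighbor-discrepancy idea behind density-based detection can be shown with a simplified score. The sketch below is not LOF or COF: it uses the mean distance to the k nearest neighbors as a stand-in, and the function name `knn_distance_scores` and the sample points are hypothetical.

```python
import math

def knn_distance_scores(points, k):
    """Mean Euclidean distance from each point to its k nearest neighbors."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Hypothetical observations: four points in a tight group and one far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0)]
scores = knn_distance_scores(points, k=2)
outlier = max(range(len(points)), key=lambda i: scores[i])
print("scores:", [round(s, 3) for s in scores])
print("most isolated point:", points[outlier])
```

A point in a sparse region is far from its neighbors and receives a large score. LOF refines this by comparing each point's local density with that of its neighbors rather than using raw distances.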
Clustering-based methods leverage distances between observations and the cluster
center to discern anomalies. Taking clustering into consideration, classic techniques
such as support vector data description (SVDD) [24] and one-class support vector
machine (OC-SVM) [25] search the hypersphere and hyperplane in the kernel space.
Subsequently, DSVDD [26] and THOC [27] are proposed to seek clusters in the deep latent
and hierarchical space for anomaly detection.
Label-based methods engage in supervised classification during the training phase.
Microsoft [28] pioneered the use of human-generated labels to train a CNN model for
anomaly detection. Extending this approach, RobustTAD [29] introduces the assignment of
label and value weights to labels and crucial points to improve detection performance.
Reconstruction-based methods have been a cornerstone of research, particularly in the
absence of labeled training data. These models are trained in an unsupervised manner,
detecting anomalies by discerning discrepancies between original and reconstructed
time series. In the earlier years, reconstruction relied on statistical methods like
ARIMA [30]. The advent of deep learning gave rise to methods such as OmniAno [6],
HIFI [31], Interfusion [32], TFAD [13], VQRAE [33], RDAE [34], and TimesNet [7]. They
reconstruct time series