
Temporal-Frequency Masked Autoencoders for Time Series Anomaly Detection
Yuchen Fang¹, Jiandong Xie², Yan Zhao³,∗, Lu Chen⁴, Yunjun Gao⁴, Kai Zheng¹,∗
¹University of Electronic Science and Technology of China, China
²Huawei Cloud Database Innovation Lab, China
³Aalborg University, Denmark
⁴Zhejiang University, China
fyclmiss@gmail.com, xiejiandong@huawei.com, yanz@cs.aau.dk, {luchen,gaoyj}@zju.edu.cn, zhengkai@uestc.edu.cn
Abstract—In the era of observability, massive amounts of time series data are collected to monitor the running status of target systems, where anomaly detection serves to identify observations that differ significantly from the remaining ones and is of utmost importance for extracting value from such data. While existing reconstruction-based methods have demonstrated favorable detection capabilities in the absence of labeled data, they still suffer from training bias toward abnormal time points and from distribution shifts within time series. To address these issues, we propose a simple yet effective Temporal-Frequency Masked AutoEncoder (TFMAE) that detects anomalies in time series through a contrastive criterion. Specifically, TFMAE uses two Transformer-based autoencoders that incorporate a window-based temporal masking strategy and an amplitude-based frequency masking strategy, respectively, to learn knowledge free of abnormal bias and to reconstruct anomalies from the extracted normal information. Moreover, the two autoencoders are trained with a contrastive objective function that minimizes the discrepancy between the representations of the temporal and frequency masked autoencoders to highlight anomalies, which also helps alleviate the negative impact of distribution shifts. Finally, to prevent overfitting, TFMAE adopts adversarial training during the training phase. Extensive experiments conducted on seven datasets provide evidence that our model surpasses the state-of-the-art in terms of anomaly detection accuracy.
Index Terms—time series anomaly detection, temporal-frequency analysis, masked autoencoder
I. INTRODUCTION
A time series is a sequence of temporally ordered observations, and its analysis has swiftly become a focal task in academic and industrial research, propelled by advances in our capability to collect and store time series data in the context of sensor networks, cloud computing, and especially the recently emerging concept of observability [1], [2]. Anomaly detection is an indispensable task in time series analysis: it determines whether the data conforms to the normal data distribution, and the non-conforming parts are called anomalies. Timely alerts for anomalies empower system maintainers to proactively conduct maintenance, enabling sustainability and safety in real applications such as fraud detection [3], intrusion detection [4], and energy management [5].
∗ Corresponding authors: Kai Zheng and Yan Zhao. Kai Zheng is with the Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, China.
However, providing accurate time series anomaly detection is a non-trivial task because the patterns of time series are intricate and dynamic across applications, which makes it hard to find a general way of defining anomalies accurately. Moreover, the scarcity of labeled data impedes the progress of supervised methods for time series anomaly detection. Unsupervised approaches can detect anomalies without labeled data; they often rely on density-based methods (leveraging discrepancies between neighbors) and clustering-based methods (utilizing distances from cluster centers). As time series data grow larger in scale and deep learning excels at data analysis, recent endeavors turn to reconstructing time series with intricate temporal correlations and multivariate dependencies using various deep learning models. These models score anomalies by the discrepancy between the reconstructed and original time series. Innovations like OmniAno [6], TimesNet [7], and TranAD [8] introduce the recurrent neural network, the convolutional neural network, and the Transformer network into time series anomaly detection, respectively.
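To make the reconstruction-based paradigm concrete, the following is a minimal sketch of per-timestep error scoring with a quantile threshold; the model interface, the mean-squared-error score, and the threshold rule are illustrative assumptions rather than the design of any specific method above.

# Minimal sketch of reconstruction-based anomaly scoring (illustrative only;
# the autoencoder, threshold rule, and window handling are assumptions).
import numpy as np


def anomaly_scores(x: np.ndarray, reconstruct) -> np.ndarray:
    """x: (T, C) multivariate series; reconstruct: model mapping x -> x_hat."""
    x_hat = reconstruct(x)                       # reconstructed series, shape (T, C)
    return np.mean((x - x_hat) ** 2, axis=-1)    # per-timestep reconstruction error


def detect(scores: np.ndarray, quantile: float = 0.99) -> np.ndarray:
    """Flag the highest-error time points as anomalies via a quantile threshold."""
    threshold = np.quantile(scores, quantile)
    return scores > threshold                    # boolean anomaly label per timestep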
Despite recent improvements, existing deep reconstruction-
based methods still face the following challenges.
Challenge I: Abnormal bias. Deep learning models rely heavily on the knowledge learned from data during the training phase, yet extracting information from time series proves more challenging than from language and image data due to its intricate patterns [9], [10]. Therefore, learning a high-quality reconstruction model becomes particularly hard, especially in the presence of knowledge-agnostic abnormal bias. As depicted in the left of Figure 1, TimesNet [7], a typical reconstruction model, reconstructs normal series well, yet it also overfits abnormal observations, leading to performance degradation. This phenomenon arises from the incorporation of misleading abnormal bias, which can blend into normal patterns through commonly used temporal modeling. Unfortunately, many existing reconstruction-based methods overlook this abnormal bias, resulting in suboptimal performance.
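For intuition on how masking can suppress such abnormal bias, the following is a minimal sketch of the two masking strategies named in the abstract, under stated assumptions: a fixed masking ratio, contiguous temporal windows chosen at random, and removal of the largest-amplitude frequency components. The exact ratios and selection rules used by TFMAE are specified later in the paper.

# Illustrative sketch of window-based temporal masking and amplitude-based
# frequency masking (ratios and selection rules are assumptions, not the
# exact TFMAE design).
import numpy as np


def temporal_window_mask(x: np.ndarray, ratio: float = 0.3, window: int = 16) -> np.ndarray:
    """Zero out randomly chosen contiguous windows covering roughly `ratio` of the series."""
    x = x.copy()
    T = x.shape[0]
    n_windows = max(1, int(ratio * T / window))
    starts = np.random.choice(max(1, T - window), size=n_windows, replace=False)
    for s in starts:
        x[s:s + window] = 0.0                    # masked positions must be reconstructed
    return x


def frequency_amplitude_mask(x: np.ndarray, ratio: float = 0.3) -> np.ndarray:
    """Remove the largest-amplitude frequency components along the time axis."""
    spec = np.fft.rfft(x, axis=0)                # frequency spectrum, shape (T//2+1, C)
    k = int(ratio * spec.shape[0])
    top = np.argsort(np.abs(spec), axis=0)[-k:]  # indices of dominant components per channel
    np.put_along_axis(spec, top, 0.0, axis=0)    # mask them out
    return np.fft.irfft(spec, n=x.shape[0], axis=0)  # masked series back in the time domain

Reconstructing the series from such masked inputs forces the model to rely on the surrounding normal context rather than on the (masked) observations themselves, which is the intuition behind learning knowledge free of abnormal bias.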
Challenge II: Time series distribution shift. Distribution shift in time series introduces a crucial factor: patterns learned from the training data may become unsuitable or even incorrect for the testing data, which results in erroneously reconstructed time series. As shown in the right of Figure 1,
the curve of the cumulative score on the testing data goes up