Second, some existing approaches rely on domain-specific assumptions, such as neighbor similarity [11, 46] and contextual consistency [59], and are thus difficult to generalize to various scenarios. For instance, Franceschi et al. [11] and Tonekaboni et al. [46] assume that subsequences distant in time should be dissimilar, which can be easily violated in periodic time series [43].
To tackle the issues mentioned above, we explore a time-series-specific representation encoder without strong assumptions for URL. In particular, we consider an encoder based on a non-parametric time series analysis concept named shapelet [58], i.e., a salient subsequence, which is tailored to extract time series features from only the important time windows so as to avoid the noise outside them. The main reason is that the shapelet-based representation has shown superior performance in specific tasks such as classification [25, 33, 54] and clustering [61]. Besides, compared to the features extracted by other neural networks such as CNNs, shapelet-based features are more intuitive to understand [58]. However, shapelets have never been explored in the recently rising topic of URL for general-purpose representation. To fill this gap, we take the first step and propose to learn a shapelet-based encoder employing contrastive learning, a popular paradigm that has shown success in URL [8, 10, 59, 64].
We highlight three challenges in learning high-quality and general-purpose shapelet-based representations. The first is how to design a shapelet-based encoder that captures diverse temporal patterns over various time ranges, considering that the shapelet was originally proposed to represent only a single shape feature, and that exhaustive search or prior knowledge is needed to determine the encoding scale [5, 25, 61]. The second is how to design a URL objective that learns general information for downstream tasks through this shapelet-based encoder, which has never been studied. Last, while contrastive learning leverages the representation similarity of the augmentations of one sample [8] to learn the encoder, it remains an open problem to properly augment the time series so as to preserve that similarity [46, 59].
To cope with these challenges, we propose a novel unsupervised MTS representation learning framework named Contrastive Shapelet Learning (CSL). Specifically, we design a unified architecture that uses multiple shapelets with various (dis)similarity measures and lengths to jointly encode a sample, so as to capture diverse temporal patterns from the short to the long term. As shapelets of different lengths separately embed one sample into different representation spaces that are complementary to each other, we propose a multi-grained contrasting objective that simultaneously considers the joint embedding and the representations at each time scale. In parallel, we design a multi-scale alignment loss to encourage the representations of different scales to reach a consensus. The basic idea is to automatically capture the varying semantics by leveraging the intra-scale and inter-scale dependencies of the shapelet-based embedding. Besides, we develop an augmentation library comprising diverse types of data augmentation methods to further improve the representation quality. To the best of our knowledge, CSL is the first general-purpose URL framework based on shapelets.
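To make the encoder idea concrete, the following is a minimal univariate sketch, not CSL's actual implementation: a sample is embedded as its minimum distances to a bank of shapelets of several lengths, with one representation per length ("scale") plus their concatenation as the joint embedding. The names `encode` and `shapelet_bank` and the choice of Euclidean distance are assumptions for illustration; CSL additionally learns the shapelets via its contrastive objective and supports multiple (dis)similarity measures.

```python
import numpy as np

def encode(series, shapelet_bank):
    """Embed a 1-D series as its minimum sliding-window distances to a
    bank of shapelets grouped by length (one representation per scale).
    Returns the per-scale vectors and their concatenation (joint embedding)."""
    per_scale = {}
    for length, shapelets in shapelet_bank.items():
        # All contiguous windows of the same length as this scale's shapelets.
        windows = np.lib.stride_tricks.sliding_window_view(series, length)
        # Distance of each shapelet to its best-matching window.
        dists = [float(np.min(np.linalg.norm(windows - s, axis=1)))
                 for s in shapelets]
        per_scale[length] = np.array(dists)
    joint = np.concatenate([per_scale[l] for l in sorted(per_scale)])
    return per_scale, joint
```

Concatenating the per-scale vectors mirrors the paper's motivation that shapelets of different lengths embed a sample into complementary representation spaces.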
The main contributions are summarized as follows:
• This paper studies how to improve URL performance using time-series-specific shapelet-based representations, which have achieved success in specific tasks but have never been explored for general-purpose URL.
• A novel framework is proposed that adopts contrastive learning to learn shapelet-based representations. A unified shapelet-based encoder architecture and a learning objective with multi-grained contrasting and multi-scale alignment are particularly designed to capture diverse patterns over various time ranges. A library containing various types of data augmentation methods is constructed to improve the representation quality.
• Experiments on tens of real-world datasets from various domains show that i) our learned representations generalize to many downstream tasks, such as classification, clustering, and anomaly detection; and ii) the proposed method outperforms existing URL competitors and is comparable to (or even better than) techniques tailored for classification and clustering. Additionally, we study the effectiveness of the key components proposed in CSL and the model's sensitivity to the key parameters, demonstrate the superiority of CSL over fully-supervised competitors on partially labeled data, and explain the shapelets learned by CSL. We also study our method on long time series representation and assess its running time.
2 RELATED WORK
There are two lines of research closely related to this paper:
Unsupervised MTS representation learning. Unlike in domains such as CV [8, 27, 29, 55] and NLP [12, 65], the study of URL in time series is still in its infancy.
Inspired by word representation [36], Franceschi et al. [11] adapt the triplet loss to time series to achieve URL. Similarly, Zerveas et al. [60] explore the utility of the transformer [51] for URL, motivated by its success in modeling natural language. Oord et al. [39] propose to learn the representation by predicting the future in latent space. Eldele et al. [10] extend this idea by conducting both temporal and contextual contrasting to improve the representation quality. Instead of using prediction, Yue et al. [59] combine timestamp-level contrasting with contextual contrasting to achieve hierarchical representation. Tonekaboni et al. [46] assume consistency between overlapping temporal neighborhoods to model dynamic latent states, while Yang and Hong [56] utilize the consistency between temporal and spectral domains to enrich the representation. Although these methods have achieved improvements in representation quality, they still have limitations, such as the lack of intuition in encoder design and the dependency on specific assumptions, as discussed in Section 1.
Time series shapelet. The concept of the shapelet was first proposed by Ye and Keogh [58] for supervised time series classification tasks. It focuses on extracting features in a notable time range to reduce the interference of noise, which is prevalent in time series.
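As a concrete illustration (not the cited authors' code), the basic shapelet feature is the minimum distance between a short pattern and all equal-length sliding windows of a series, so only the best-matching time range contributes and noise elsewhere is ignored. The function name and the Euclidean distance are assumptions for this sketch; many variants use normalized or other distances.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and all
    equal-length sliding windows of a 1-D time series."""
    m = len(shapelet)
    windows = np.lib.stride_tricks.sliding_window_view(series, m)
    return float(np.min(np.linalg.norm(windows - shapelet, axis=1)))
```

A small distance indicates that the series contains a subsequence closely matching the shapelet, regardless of where it occurs or what happens elsewhere.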
In the early studies, shapelets are selected by enumerating subsequences of the training time series [5, 17, 38, 58], which suffers from non-optimal representation and high computational overhead [13].
To address these problems, a shapelet learning method was first proposed by Grabocka et al. [13], which directly learns the optimal shapelets through a supervised objective. Following this study, many approaches [30, 33, 34, 54] have been proposed to improve the effectiveness and efficiency of classification. Beyond the supervised classification task, some works [47, 61, 62] employ shapelets for time series clustering and also show competitive performance.