暂无图片
暂无图片
暂无图片
暂无图片
暂无图片
阿里云_VLDB-2024-Swat-A System-Wide Approach to Tunable Leakage Mitigation in Encrypted Data Stores.pdf
619
17页
2次
2024-08-13
免费下载
Swat: A System-Wide Approach to Tunable Leakage Mitigation in
Encr ypted Data Stores
Leqian Zheng
City University of Hong Kong
leqian.zheng@my.cityu.edu.hk
Lei Xu
Nanjing University of Science and
Technology
xuleicrypto@gmail.com
Cong Wang
City University of Hong Kong
congwang@cityu.edu.hk
Sheng Wang
Alibaba Group
sh.wang@alibaba-inc.com
Yuke Hu
Zhejiang University
yukehu@zju.edu.cn
Zhan Qin
Zhejiang University
qinzhan@zju.edu.cn
Feifei Li
Alibaba Group
lifeifei@alibaba-inc.com
Kui Ren
Zhejiang University
kuiren@zju.edu.cn
ABSTRACT
Numerous studies have underscored the signicant privacy risks
associated with various leakage patterns in encrypted data stores.
While many solutions have been proposed to mitigate these leak-
ages, they either (1) incur substantial overheads, (2) focus on spe-
cic subsets of leakage patterns, or (3) apply the same security
notion across various workloads, thereby impeding the attainment
of ne-tuned privacy-eciency trade-os. In light of various detri-
mental leakage patterns, this paper starts with an investigation into
which specic leakage patterns require our focus in the contexts of
key-value, range-query, and dynamic workloads, respectively. Sub-
sequently, we introduce new security notions tailored to the specic
privacy requirements of these workloads. Accordingly, we propose
and instantiate Swat, an ecient construction that progressively
enables these workloads, while provably mitigating system-wide
leakage via a suite of algorithms with tunable privacy-eciency
trade-os. We conducted extensive experiments and compiled a de-
tailed result analysis, showing the eciency of our solution. Swat
is about an order of magnitude slower than an encryption-only
data store that reveals various leakage patterns and is two orders of
magnitude faster than a trivial zero-leakage solution. Meanwhile,
the performance of Swat remains highly competitive compared to
other designs that mitigate specic types of leakage.
1 INTRODUCTION
With the advent of cloud computing, many companies and institu-
tions are outsourcing databases and workloads from their private
data centers to the cloud. While this transition oers advantages
such as increased availability, scalability, and cost-eectiveness, it
also exposes users to potential privacy breaches and data abuse.
Consequently, there has been signicant progress in constructing
encrypted databases to prevent adversaries with high privileges
or even physical access to the server from learning sensitive data.
Specically, one line of work [
64
,
65
] leverages specialized crypto-
graphic primitives to perform dierent operations over ciphertexts.
Another prosperous line of work [
3
,
6
,
25
,
67
,
81
,
84
,
91
,
97
] we
will follow utilizes trusted execution environments (TEE), e.g., Intel
SGX and AMD SEV, to process condential data as plaintext in an
isolated and protected approach.
Unfortunately, despite powerful enclaves, recent studies show
that encrypted databases still exhibit diverse leakage patterns, in-
cluding 1) memory access pattern indicates which memory blocks
are accessed, potentially revealing sensitive information like the
access frequency of encrypted records; 2) volume pattern refers to
the size of the query result set, which can be easily obtained by
observing network communication; 3) order pattern refers to the
ordinal relationship between data, which may be inferred from
data storage (e.g., encrypted albeit sequentially stored data) or
memory accesses (e.g., a search over a B+ tree directly reveals
the exact ordinal relationship between nodes along the path); 4)
query correlation pattern reveals how queries are correlated, e.g.,
humans or workloads often generate queries based on previous
ones; and 5) operation timestamp pattern denotes when the en-
crypted database is accessed. These leakage patterns pose risks for
adversaries to recover condential queries or data through leakage
attacks [11, 16, 30, 32–36, 38, 42, 43, 48, 49, 51, 60, 89, 93].
Given its critical importance, many solutions [
2
,
25
,
31
,
45
,
59
,
61
,
68
,
96
,
97
] have thus been proposed to thwart these attacks by
eliminating or mitigating these leakage patterns. From the perspec-
tive of performance, attaining full leakage suppression for even a
single pattern entails substantial overhead, some of which are even
insurmountable. For instance, the well-known logarithmic lower
bounds for ORAM [
27
,
52
]) almost make it theoretically infeasible
to provide obliviousness in large-scale databases. And full query
decorrelation in a known query distribution has been shown in [
31
]
to be as hard as the oine ORAM. Fortunately, it is unnecessary
to oer full leakage suppression in many scenarios since (1) the
adversary’s auxiliary information about the encrypted data and/or
query workloads is practically biased or distorted, and (2) an adver-
sary has to accumulate sucient leakage to launch risky leakage
attacks. For instance, some known-data attacks [
11
,
30
,
42
] assume
explicit knowledge of a (probably large) subset of the encrypted
data or queries, which seems too strong in reality. Kellaris et al
.
[47] show that an adversary needs Ω (𝑛
4
) range queries (exposing
access patterns) to reconstruct the exact value of every record in
an encrypted database of size
𝑛
. And recovering data values with a
arXiv:2306.16851v2 [cs.CR] 15 May 2024
Leqian Zheng, Lei Xu, Cong Wang, Sheng Wang, Yuke Hu, Zhan Qin, Feifei Li, and Kui Ren
relative error
𝜖
needs
Ω(𝜖
4
)
queries [
33
]. Hence, mitigating partial
yet signicant leakage with manageable performance overheads is
destined and admissible. Considering diverse and complex deploy-
ment scenarios, such a trade-o between performance and security
should ideally be tunable.
From the perspective of system leakage, most countermeasures
protect only subsets of leakage patterns. For instance, ObliDB [
25
]
provides a set of customized oblivious operators to conceal mem-
ory access patterns across various query workloads, yet it ignores
hiding the sizes of the intermediate and result tables (i.e., volume
pattern). Most volume-hiding solutions [
2
,
45
,
61
,
96
] ignore the
query equality pattern, which indicates whether two queries re-
peat. Pancake [
31
] mitigates the access pattern via smoothing the
access frequency to entries in an encrypted data store, while the
exposed query correlation pattern has been shown to be vulnerable
to IHOP attack [
60
]. It is hence essential to consider system-wide
leakage while designing encrypted data stores. This problem is
more pronounced in leakage mitigation schemes, where relaxing
protection over existing leakage may inadvertently expose new
and detrimental leakage patterns. A notable example is that relax-
ing oblivious access to frequency-smoothed access introduces the
query correlation pattern [31, 60].
From the perspective of protection applied, existing solutions
typically apply a uniform security notion over all supported work-
loads. This approach simplies the comprehension of the system’s
security but impedes the attainment of ne-tuned and improved
privacy-eciency trade-os. For instance, Adore [
68
] protects a
set of common workloads in relational databases in a dierentially
oblivious approach, which ensures that the access pattern complies
with the dierential privacy notion. However, some workloads,
such as table joins that do not preserve neighboring outputs (as
discussed in §5), may not align well with such a protection mea-
sure. Besides,
E
psolute [
12
] applies the same dierentially private
sanitizer to both point and range queries, while we could leverage
a more ecient scheme [
61
] to hide the volume pattern in point
queries. It is hence valuable to identify the nuanced privacy re-
quirements inherent in each specic workload, with the primary
challenge being that eciently and securely accommodating a new
workload may necessitate modications to the existing ones.
1.1 Our Contributions
Drawing on the above insights, we investigate the system-wide
approach towards tunable leakage mitigation by following recent
enclave-based encrypted data stores [
3
,
25
,
84
,
91
]. To this end,
we present Swat, an ecient encrypted data store that progres-
sively supports key-value, range-query, and dynamic workloads
with tunable system-wide leakage mitigation. We adopt a widely
accepted assumption [
12
,
31
] that posits the existence of a trusted
client proxy. The proxy is responsible for routing client queries to
the enclave deployed on the cloud server via a secure and authenti-
cated channel. We additionally assume a dedicated communication
channel as the billing model based on network bandwidth (rather
than data transferred) usage per month (or year), which we denote
as pay-by-bandwidth, is a standard practice in cloud services
1
.
1
Examples in alphabetical order include Alibaba Cloud, AWS Direct Connect, Azure
ExpressRoute, and Tencent Cloud.
Our work starts from the key-value workload due to its simplicity,
where keys and equal-length values are protected by pseudorandom
functions and authenticated encryption. We mitigate the primary
leakage, i.e., access and query correlation patterns, via frequency
smoothing and partial query decorrelation respectively. We adopt
Pancake for the former one due to its noteworthy gains in balancing
performance and security. Pancake provides provable assurance
of achieving a uniform access frequency for each entry in the key-
value store, irrespective of their original access distribution, as long
as each query is generated independently. However, it falls short in
protecting encrypted data stores when queries are correlated [60].
We hence devise an elegant and almost-for-free security patch,
modeled as
𝜃
-query decorrelation, to mitigate this leakage pattern.
This technique also holds potential for broader applications in other
searchable encryption systems.
We then extend Swat to handle range queries that inherently
expose more leakage, such as order and volume patterns. To sys-
tem widely address these leakage patterns, we introduce a formal
security notion aimed at reducing leakage of range query systems
to that of the previous stage (i.e., key-value stores). Intuitively, no
adversary under this notion can distinguish between a sequence of
data accesses from range queries (sampled from an arbitrary dis-
tribution) and those from uniformly random sampling. To achieve
this, we develop an ecient protocol that sets up the encrypted
data store by partitioning the input dataset into buckets without
revealing their order, and handles queries by accessing the data
store at a xed rate and retrieving a xed number of buckets in
each access. This protocol eectively suppresses order, volume, and
search timestamp patterns, with reasonable monetary cost (thanks
to the pay-by-bandwidth billing model).
Subsequently, we introduce a data-structure dynamization tech-
nique to enable updates over the encrypted data store. In order to
maintain query eciency, the data store requires a well-structured,
albeit hidden from anyone but the trusted proxy (for security),
search index. Unless properly mitigated, the memory access pattern
during index updates will expose sensitive information regarding
the underlying structure of the encrypted data. We capture such
a dominant privacy demand by formulating a dierential obliv-
iousness notion [
17
] since it oers principled privacy-eciency
trade-os. We also evolve a
𝑘
-way dierentially oblivious merge
algorithm from the 2-way one [
17
], which serves as the foundation
of dynamization, to oer better privacy guarantees. Moreover, we
improve the practical performance of the dierentially oblivious
merge algorithm by notably reducing the size of its oblivious buer
without hurting the security guarantees it claims.
We then implement an end-to-end system Swat that realizes
the above functionalities on top of Intel SGX. Swat is about 10
.
6
×
slower than an encryption-only database that exposes all detrimen-
tal leakage patterns and 31
.
6
×
faster than a trivial solution that
eliminates all these patterns. We also compare it to ObliDB [
59
],
the state-of-the-art oblivious database, and
E
psolute [
12
], the state-
of-the-art range-query system mitigating the volume pattern and
eliminating the memory access pattern. The result shows that our
design provides competitive performance while mitigating system-
wide leakage patterns. We also run extensive experiments with
various settings and compiled a detailed result analysis.
We summarize our contributions in this work as follows:
of 17
免费下载
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文档的来源(墨天轮),文档链接,文档作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

关注
最新上传
暂无内容,敬请期待...
下载排行榜
Top250 周榜 月榜