
Ruan Jian Xue Bao ISSN 1000-9825, CODEN RUXUEW E-mail: jos@iscas.ac.cn
Journal of Software, [doi: 10.13328/j.cnki.jos.006044] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563
A Local Differential Privacy Transaction Data Collection Method for Frequent Itemset Mining
OUYANG Jia¹, YIN Jian², XIAO Zheng-Hong¹, ZHAO Hui-Min¹, LIU Shao-Peng¹, LIANG Peng¹, XIAO Yin-Yin¹
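¹ (College of Computer Science, Guangdong Polytechnic Normal University, Guangzhou 510665, China)
² (School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510275, China)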
Corresponding author: XIAO Zheng-Hong, E-mail: huasxzh@126.com
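Abstract: Transaction data arise in many application scenarios, such as shopping records and page-browsing histories. Service
providers collect and analyze user data to provide better services, but collecting transaction data can leak users' private
information. To address this problem, this paper proposes a transaction data collection method based on the condensed local
differential privacy model. First, a new score function for candidate itemsets is defined. Second, based on this function, the
sample space of candidate itemsets is partitioned into multiple subspaces. Third, one subspace is selected at random, and
transaction data are randomly generated from it and sent to the untrusted data collector. Finally, to address the problem of
setting the privacy parameter, a heuristic privacy parameter setting strategy is designed based on the maximum posterior
confidence attack model. Theoretical analysis shows that the method protects both the length and the content of transaction
data and satisfies condensed local differential privacy. Experiments show that, compared with the current state-of-the-art
work, the data collected by this method have higher utility and the privacy parameter setting is more semantically meaningful.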
Key words: privacy preservation; data collection; transaction data; local differential privacy; privacy parameter
Chinese Library Classification number: TP311
Citation format: Ouyang J, Yin J, Xiao ZH, Zhao HM, Liu SP, Liang P, Xiao YY. Transaction Data Collection for Itemset Mining
under Local Differential Privacy. Ruan Jian Xue Bao/Journal of Software, (in Chinese). http://www.jos.org.cn/1000-9825/6044.htm
Transaction Data Collection for Itemset Mining under Local Differential Privacy
OUYANG Jia¹, YIN Jian², XIAO Zheng-Hong¹, ZHAO Hui-Min¹, LIU Shao-Peng¹, LIANG Peng¹, XIAO Yin-Yin¹
¹ (College of Computer Science, Guangdong Polytechnic Normal University, Guangzhou 510665, China)
² (School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510275, China)
Abstract: Transaction data are common in various application scenarios, such as shopping records and page browsing histories. Service
providers collect and analyze transaction data to provide better services. However, collecting transaction data may disclose private
information. To solve this problem, this paper proposes a transaction data collection mechanism based on Condensed Local Differential
Privacy (CLDP). First, we define a new score function for candidate itemsets. Second, we partition the output domain of the candidate
itemsets into several subspaces according to this function. Third, the client selects one subspace at random, randomly generates
transaction data based on that subspace, and sends the data to the untrusted data collector. Finally, considering the difficulty of
setting the privacy parameter, we design a heuristic privacy parameter setting strategy based on the maximum posterior confidence (MPC)
threat model. The theoretical analysis shows that this method protects both the length and the content of transaction data and
satisfies α-CLDP. The
Foundation item: National Natural Science Foundation of China (61702119, U1711262, U1501252, U1711261); Science and
Technology Program of Guangzhou (201804010236, 201607010152); Guangdong Basic and Applied Basic Research Foundation
(2019A1515012048); Innovation Team Project of the Education Department of Guangdong Province (2017KCXTD021)
Received 2019-11-06; Revised 2020-01-30, 2020-03-09; Accepted 2020-03-20; JOS published online 2021-05-20