
前言
今天聊聊数据脱敏吧。
单纯是想玩一下 percona 的 data_masking 插件和 PG 的postgresql_anonymizer 扩展,顺便记录一下哈哈。
个人公众号就是口气大,没办法个人的就是自由啦。
什么是数据脱敏
percona 的说法是:
数据脱敏通过阻止未经授权的用户访问真实数据来保护敏感信息。此过程会创建更改后的数据版本,用于特定用途,例如演示文稿、销售演示或软件测试。脱敏后的数据保留与原始数据相同的格式,但包含无法逆转以显示真实信息的更改值。通过使数据对外部人员毫无价值,脱敏可帮助组织降低数据泄露或滥用的风险。公司可以安全地在各种场景中使用脱敏后的数据,而不会向未经授权的各方泄露机密详细信息。
简单说就是,以各种方式保护原表数据不被泄露,即使泄露也是无关紧要的部分内容,使用比如:数据加密、加盐、替换等方法是遮蔽这些内部数据。
PS:(悄悄话)这篇不讲加密 AES_DECRYPT() 之类的加密算法,太麻烦了,又要证书又要密钥,我懒得搞哈哈。什么时候想写了再写!!一切均是兴趣使然,强迫不得也。
MySQL 数据脱敏
mysql 的脱敏插件是企业版才有,爱白嫖的我们当然就选择,大慈善家 percona 了。
percona-mysql 有一个 data_masking 插件,我们去顺过来装在社区版玩,因为是同源,所以基本上就是兼容的。
我的mysql是 8.0.32 glibc 2.17
root@localhost [(none)] 22:56:35 > select version();+-----------+| version() |+-----------+| 8.0.32 |+-----------+1 row in set (0.02 sec)[root@localhost ~]# ldd --versionldd (GNU libc) 2.17Copyright (C) 2012 Free Software Foundation, Inc.This is free software; see the source for copying conditions. There is NOwarranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.由 Roland McGrath 和 Ulrich Drepper 编写。
所以去拿 percona 对应版本的:https://www.percona.com/downloads#percona-server-mysql
我选的是这个:Percona-Server-8.0.32-24-Linux.x86_64.glibc2.17-minimal.tar.gz

安装 data_masking 插件
解压一下,在这个位置下:Percona-Server-8.0.32-24-Linux.x86_64.glibc2.17-minimal/lib/plugin
可以找到 data_masking 插件
[root@localhost plugin]# ll data*-rw-r--r--. 1 root root 1374 3月 10 2023 data_masking.ini-rwxr-xr-x. 1 root root 179880 3月 10 2023 data_masking.so
有一个 .ini 一个 data_masking.so,data_masking.so 就是插件。
我们要把它复制到我们的 plugin_dir 下
root@localhost [(none)] 22:53:58 > show variables like 'plugin%';+---------------+---------------------------------------+| Variable_name | Value |+---------------+---------------------------------------+| plugin_dir | application/mysql_8_3306/lib/plugin/ |+---------------+---------------------------------------+1 row in set (0.00 sec)
复制过去然后授权
cp data_masking.so application/mysql_8_3306/lib/plugin/chown mysql.mysql application/mysql_8_3306/lib/plugin/data_masking.so
检查一下
[root@localhost plugin]# ll application/mysql_8_3306/lib/plugin/data_masking.so-rwxr-xr-x. 1 mysql mysql 179880 10月 29 22:54 application/mysql_8_3306/lib/plugin/data_masking.so
好了,现在启动!!!
安装 data_masking
root@localhost [(none)] 22:56:32 > INSTALL PLUGIN data_masking SONAME 'data_masking.so';Query OK, 0 rows affected (0.24 sec)
OK 顺手牵羊成功
使用 data_masking 插件
看看 percona 官方写了有哪些函数:https://docs.percona.com/percona-server/8.0/data-masking-plugin-functions.html#data-masking-plugin-functions
| 函数名 | 描述 |
|---|---|
| mask_inner(string, margin1, margin2 [, character]) | 返回仅对字符串内部部分进行脱敏的结果。可以指定不同的掩码字符。 |
| mask_outer(string, margin1, margin2 [, character]) | 掩码字符串的外部部分。内部部分未被掩码。可以指定不同的掩码字符。 |
| mask_pan(string) | 通过用“X”替换字符串(最后四个字符除外)来掩码主账户号码(PAN)。PAN 字符串必须是 15 或 16 个字符。 |
| mask_pan_relaxed(string) | 返回前六个数字和最后四个数字。其余部分用“X”替换。 |
| mask_ssn(string) | 返回一个字符串,仅显示最后四个数字。其余部分用“X”替换。 |
| gen_range(lower, upper) | 根据选定范围生成随机数,并支持负数。 |
| gen_rnd_email() | 生成一个随机电子邮件地址。域名为 example.com。 |
| gen_rnd_pan([size in integer]) | 生成一个随机主账户号码。此函数仅应用于测试目的。 |
具体参考:https://docs.percona.com/percona-server/8.0/data-masking-plugin-functions.html#data-masking-plugin-functions
翻译了一些我觉得挺有意思的函数描述。
让我们试一下这个。
测试数据
create table mask(id int primary key,name varchar(64),phone varchar(11),idcard varchar(64));insert into mask values(1,'红太狼','12345678911','1234567891012134');insert into mask values(2,'灰太狼','12345678911','1234567891012134');
红太狼和灰太狼用一个手机号码和银行卡没问题吧!!没毛病吧哈哈
看一下小夫妻的个人信息。
root@localhost [ddl] 23:16:57 > select * from mask;+----+-----------+-------------+------------------+| id | name | phone | idcard |+----+-----------+-------------+------------------+| 1 | 红太狼 | 12345678911 | 1234567891012134 || 2 | 灰太狼 | 12345678911 | 1234567891012134 |+----+-----------+-------------+------------------+2 rows in set (0.00 sec)
试一下 mask_inner(),string 是指定要脱敏的字符串,margin1/margin2 是左右分别不掩码的字符串个数,character 掩码的内容。
root@localhost [ddl] 23:22:37 > select mask_inner('abcdef',1,1,'*'),mask_inner('abcdef',2,1,'='),mask_inner('abcdef',1,2,'-');+------------------------------+------------------------------+------------------------------+| mask_inner('abcdef',1,1,'*') | mask_inner('abcdef',2,1,'=') | mask_inner('abcdef',1,2,'-') |+------------------------------+------------------------------+------------------------------+| a****f | ab===f | a---ef |+------------------------------+------------------------------+------------------------------+1 row in set (0.01 sec)
还挺牛逼的哈。
放到表里面看看
root@localhost [ddl] 23:24:52 > select id,mask_inner(name,1,0,'*'),mask_inner(phone,3,4,'#'),mask_inner(idcard,4,4,'#') from mask;+----+--------------------------+---------------------------+----------------------------+| id | mask_inner(name,1,0,'*') | mask_inner(phone,3,4,'#') | mask_inner(idcard,4,4,'#') |+----+--------------------------+---------------------------+----------------------------+| 1 | ?** | 123####8911 | 1234########2134 || 2 | ?** | 123####8911 | 1234########2134 |+----+--------------------------+---------------------------+----------------------------+2 rows in set (0.00 sec)
好像没法处理中文诶。
那就换一种叭,用 concat 来实现。好了完美。
root@localhost [ddl] 23:35:57 > select id,concat(left(name,1),repeat('*',char_length(name)-1)),mask_inner(phone,3,4,'#'),mask_inner(idcard,4,4,'#') from mask;+----+------------------------------------------------------+---------------------------+----------------------------+| id | concat(left(name,1),repeat('*',char_length(name)-1)) | mask_inner(phone,3,4,'#') | mask_inner(idcard,4,4,'#') |+----+------------------------------------------------------+---------------------------+----------------------------+| 1 | 红** | 123####8911 | 1234########2134 || 2 | 灰** | 123####8911 | 1234########2134 |+----+------------------------------------------------------+---------------------------+----------------------------+2 rows in set (0.00 sec)
你猜这样会走索引吗?
create index mask_id_name_idcard on mask(id,name,idcard);
肯定会走啦。这里数据太少了,就懒得测啦。
其他的函数其实都差不多,我举例一下,就不放表里面了
mask_outer 是对两边掩码
root@localhost [ddl] 23:46:17 > select mask_outer('abcde',1,1,'*');+-----------------------------+| mask_outer('abcde',1,1,'*') |+-----------------------------+| *bcd* |+-----------------------------+1 row in set (0.00 sec)
mask_pan 是对 15~16 位的字符掩码,超过或者少了都不行
root@localhost [ddl] 23:46:45 > select mask_pan('12345678978997555'),mask_pan('123456789789975'),mask_pan('adwewsdawertyui');+-------------------------------+-----------------------------+-----------------------------+| mask_pan('12345678978997555') | mask_pan('123456789789975') | mask_pan('adwewsdawertyui') |+-------------------------------+-----------------------------+-----------------------------+| 12345678978997555 | XXXXXXXXXXX9975 | XXXXXXXXXXXtyui |+-------------------------------+-----------------------------+-----------------------------+1 row in set (0.00 sec)
mask_pan_relaxed 是对 15~16 位的字符掩码,超过或者少了都不行,然后留下前 6 个和后 4 个
root@localhost [ddl] 23:50:57 > select mask_pan_relaxed('awdesdesxdersat'),mask_pan_relaxed('4664646478674669');+-------------------------------------+--------------------------------------+| mask_pan_relaxed('awdesdesxdersat') | mask_pan_relaxed('4664646478674669') |+-------------------------------------+--------------------------------------+| awdesdXXXXXrsat | 466464XXXXXX4669 |+-------------------------------------+--------------------------------------+1 row in set (0.00 sec)
mask_ssn 是 9 位这种格式:654-66-2262,然后留下后 4 位
root@localhost [ddl] 23:53:50 > select mask_ssn('654-66-2262');+-------------------------+| mask_ssn('654-66-2262') |+-------------------------+| XXX-XX-2262 |+-------------------------+1 row in set (0.00 sec)
还有 3 个生成的一起看看吧,
gen_range 生成范围内的随机数
gen_rnd_email 生成一个 @example.com 结尾的 email
gen_rnd_pan 生成 15~16 位
root@localhost [ddl] 23:57:47 > select gen_range(-10,-1),gen_rnd_email(),gen_rnd_pan();+-------------------+----------------------+-----------------+| gen_range(-10,-1) | gen_rnd_email() | gen_rnd_pan() |+-------------------+----------------------+-----------------+| -4 | Xr603HSB@example.com | 341391117060328 |+-------------------+----------------------+-----------------+1 row in set (0.00 sec)
还有一个数据字典的,要用到 secure_file_priv 变量定义数据字典的物理文件。
我这里没有定义,它是只读变量
root@localhost [ddl] 00:01:42 > show variables like 'secure_file_priv';+------------------+-------+| Variable_name | Value |+------------------+-------+| secure_file_priv | NULL |+------------------+-------+1 row in set (0.00 sec)root@localhost [ddl] 00:01:48 > set global secure_file_priv='/log';ERROR 1238 (HY000): Variable 'secure_file_priv' is a read only variable
写配置文件
[mysqld]secure_file_priv='/log'
然后重启
root@localhost [(none)] 00:04:47 > show variables like 'secure_file_priv';+------------------+-------+| Variable_name | Value |+------------------+-------+| secure_file_priv | log/ |+------------------+-------+1 row in set (0.02 sec)
定义好了,我们在这里面写给物理文件
echo -e 'xiyy\nmeiyy\nfeiyy' > log/dictionary.confchown mysql.mysql log/dictionary.confecho -e 'lanyy\nliyy\nmanyy' > log/dictionary2.confchown mysql.mysql log/dictionary2.conf
然后加载一下
root@localhost [(none)] 00:04:50 > select gen_dictionary_load('/log/dictionary.conf','d1');+--------------------------------------------------+| gen_dictionary_load('/log/dictionary.conf','d1') |+--------------------------------------------------+| Dictionary load success |+--------------------------------------------------+1 row in set (0.00 sec)root@localhost [(none)] 00:09:23 > select gen_dictionary_load('/log/dictionary2.conf','d2');+---------------------------------------------------+| gen_dictionary_load('/log/dictionary2.conf','d2') |+---------------------------------------------------+| Dictionary load success |+---------------------------------------------------+1 row in set (0.00 sec)
OK,成功了
然后就可以随机查数据字典了。
root@localhost [(none)] 00:11:09 > select gen_dictionary('d1'),gen_dictionary('d2');+----------------------+----------------------+| gen_dictionary('d1') | gen_dictionary('d2') |+----------------------+----------------------+| feiyy | liyy |+----------------------+----------------------+1 row in set (0.00 sec)root@localhost [(none)] 00:11:56 > select gen_dictionary('d1'),gen_dictionary('d2');+----------------------+----------------------+| gen_dictionary('d1') | gen_dictionary('d2') |+----------------------+----------------------+| feiyy | lanyy |+----------------------+----------------------+1 row in set (0.00 sec)
gen_blacklist 这个是对等替换,如果 第一个参数是 d1 的就替换 d2 的同一个位置的值。
root@localhost [(none)] 00:14:28 > select gen_blacklist('xiyy','d1','d2');+---------------------------------+| gen_blacklist('xiyy','d1','d2') |+---------------------------------+| lanyy |+---------------------------------+1 row in set (0.00 sec)root@localhost [(none)] 00:14:37 > select gen_blacklist('meiyy','d1','d2');+----------------------------------+| gen_blacklist('meiyy','d1','d2') |+----------------------------------+| liyy |+----------------------------------+1 row in set (0.00 sec)
[root@localhost ~]# cat log/dictionary.conf xiyymeiyyfeiyy[root@localhost ~]# cat log/dictionary2.conflanyyliyymanyy
如果是 xiyy 就替换了第一个位置的 lanyy。
如果是 meiyy 就替换了 第二个位置的 liyy。
好了,data_masking 插件讲完了。
实践脱敏
真正的脱敏应该是用视图做,然后取消别的用户对表的查看权限。
create view v_mack as select id,concat(left(name,1),repeat('*',char_length(name)-1)),mask_inner(phone,3,4,'#'),mask_inner(idcard,4,4,'#') from mask;
看一下这个视图:
root@localhost [ddl] 00:25:00 > select * from v_mack;+----+------------------------------------------------------+---------------------------+----------------------------+| id | concat(left(name,1),repeat('*',char_length(name)-1)) | mask_inner(phone,3,4,'#') | mask_inner(idcard,4,4,'#') |+----+------------------------------------------------------+---------------------------+----------------------------+| 1 | 红** | 123####8911 | 1234########2134 || 2 | 灰** | 123####8911 | 1234########2134 |+----+------------------------------------------------------+---------------------------+----------------------------+2 rows in set (0.01 sec)
因为用了函数,所以这个视图是没法更新的奥。
root@localhost [ddl] 00:25:08 > insert into v_mack values(3,'小灰灰','12345678911','1234567891012134');ERROR 1348 (HY000): Column 'concat(left(name,1),repeat('*',char_length(name)-1))' is not updatable
这时候就只需要给视图的 select 权限就可以了,我们试一下
create user mask identified by '123456';grant select on ddl.v_mack to mask;
查询一下就可以看到,已经脱敏了。
mask@192.168.58.60 [(none)] 00:28:25 > select * from ddl.mask;ERROR 1142 (42000): SELECT command denied to user 'mask'@'192.168.58.135' for table 'mask'mask@192.168.58.60 [(none)] 00:28:36 > select * from ddl.v_mack;+----+------------------------------------------------------+---------------------------+----------------------------+| id | concat(left(name,1),repeat('*',char_length(name)-1)) | mask_inner(phone,3,4,'#') | mask_inner(idcard,4,4,'#') |+----+------------------------------------------------------+---------------------------+----------------------------+| 1 | 红** | 123####8911 | 1234########2134 || 2 | 灰** | 123####8911 | 1234########2134 |+----+------------------------------------------------------+---------------------------+----------------------------+2 rows in set (0.00 sec)
mysql 是没有动态脱敏的,等会玩 PG 的动态脱敏。
PG 数据脱敏
PG 脱敏的话,这篇玩的是 postgresql_anonymizer:https://gitlab.com/dalibo/postgresql_anonymizer
postgresql_anonymizer 支持动态脱敏和静态脱敏,相当于 mysql 的脱敏方式更加灵活。
这个脱敏有 2 种:
动态的用户级脱敏
静态的直接修改值(这个不太建议)
我就演示 动态的用户级脱敏。
编译 postgresql_anonymizer
安装依赖
yum install zlib-devel readline-devel libxml2-devel libxslt-devel openssl-devel perl-devel perl-ExtUtils-Embed python-devel gcc-c++ libicu-devel python3 python3-devel bison flex
拉取包
git clone https://gitlab.com/dalibo/postgresql_anonymizer.gitorwget https://gitlab.com/dalibo/postgresql_anonymizer/-/archive/2.0.0-beta.3/postgresql_anonymizer-2.0.0-beta.3.tar.gztar zxvf postgresql_anonymizer-2.0.0-beta.3.tar.gz
postgresql_anonymizer 依赖于 pgrx ,参考 pgrx 项目:pgcentralfoundation/pgrx
安装Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | shsource "$HOME/.cargo/env"
安装 pgrx
cargo install cargo-pgrx --version 0.12.6 --lockedcargo pgrx init
注意版本号要与 Cargo.toml 中的版本对应。
[package]name = "anon"version = "2.0.0-beta.3"[package]name = "anon"version = "2.0.0-beta.3"edition = "2021"[lib]crate-type = ["cdylib", "lib"][features]default = ["pg13"]#pg11 = ["pgrx/pg11", "pgrx-tests/pg11" ]pg12 = ["pgrx/pg12", "pgrx-tests/pg12" ]pg13 = ["pgrx/pg13", "pgrx-tests/pg13" ]pg14 = ["pgrx/pg14", "pgrx-tests/pg14" ]pg15 = ["pgrx/pg15", "pgrx-tests/pg15" ]pg16 = ["pgrx/pg16", "pgrx-tests/pg16" ]pg17 = ["pgrx/pg17", "pgrx-tests/pg17" ]pg_test = [][dependencies]c_str_macro = "1.0.3"chrono = "0.4.37"fake = { version = "2.10.0", features = ["bigdecimal", "chrono", "http", "rust_decimal", "uuid", "time","random_color"] }md-5 = "0.10.6"paste = "1.0"pgrx = "0.12.6"rand = "0.8.5"regex = "1.10.2"[dev-dependencies]pgrx-tests = "0.12.6"[profile.dev]panic = "unwind"[profile.release]panic = "unwind"
装依赖
[postgres@postgresql1 postgresql_anonymizer]$ sudo yum install clang[postgres@postgresql1 postgresql_anonymizer]$ locate libclang.sousr/lib64/clang-private/libclang.so/usr/lib64/clang-private/libclang.so.6/usr/lib64/clang-private/libclang.so.6.0[postgres@postgresql1 postgresql_anonymizer]$ export LIBCLANG_PATH=/usr/lib64/clang-private/libclang.so.6.0[postgres@postgresql1 postgresql_anonymizer]$ export LD_LIBRARY_PATH=/usr/lib64/clang-private:$LD_LIBRARY_PATH
编译
make extension PG_CONFIG=/pgsql/bin/pg_configmake install PG_CONFIG=/pgsql/bin/pg_config
安装
创建一个测试库 encryption
testdb=# create database encryption;CREATE DATABASE
然后把 anon 添加到库级的 session_preload_libraries
testdb=# \c encryption -You are now connected to database "encryption" as user "postgres".encryption=# \dxList of installed extensionsName | Version | Schema | Description---------+---------+------------+------------------------------plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language(1 row)encryption=# show session_preload_libraries;session_preload_libraries---------------------------(1 row)encryption=# alter database encryption set session_preload_libraries = 'anon';ALTER DATABASEencryption=# \q
创建扩展 EXTENSION
[postgres@postgresql1 ~]$ psqlpostgres_dba 6.0 installed. Use ":dba" to see menupsql (16.3)Type "help" for help.testdb=# \c encryption -You are now connected to database "encryption" as user "postgres".encryption=# create extension pgcrypto;CREATE EXTENSIONencryption=# CREATE EXTENSION anon CASCADE;CREATE EXTENSION
anon 依赖于 pgcrypto,所以要先创建 pgcrypto
encryption=# \dxList of installed extensionsName | Version | Schema | Description----------+--------------+------------+---------------------------------------------anon | 2.0.0-beta.4 | public | Anonymization & Data Masking for PostgreSQLpgcrypto | 1.3 | public | cryptographic functionsplpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language(3 rows)
可以看到装上了
简要说说 anon
它支持这八种脱敏策略,功能还是很丰富的,这里我就不详细说
Destruction(重构)
Adding Noise(噪音化)
Randomization(随机化)
Faking(伪造)
Pseudonymization(笔名化)
Generic Hashing(哈希)
Partial scrambling(部分遗漏)
Generalization((泛化))
大家可以看官方文档:https://postgresql-anonymizer.readthedocs.io/en/stable/masking_functions/
主要参数介绍一下:
anon.algorithm:脱敏函数使用的 hash 算法,默认为 sha256,支持 md5、sha256、sha384 和sha512 等,需要超级用户权限才可设置。
anon.salt:pseudonymizing(假名化) functions 使用的盐,一般建议每个数据库设置不一样的值,需要超级用户权限才可设置。
anon.maskschema:用于存储动态脱敏视图的模式,默认名称为 mask。
anon.sourceshema:动态脱敏引擎脱敏表所在的 schema(即 'namespace')。 默认是 public 模式。
anon.restrict_to_trusted_schemas:通过启用此参数,必须使用位于有限命名空间列表中的函数来定义屏蔽规则。默认情况下,只有
anon
schema 是受信任的。默认值:off
动态脱敏
进行如下设置:
alter database encryption set anon.restrict_to_trusted_schemas = on;alter database encryption set anon.salt = 'db_encryption_Secret_Salt';
重新连接数据库后执行如下语句,启动动态脱敏:
select anon.start_dynamic_masking();
要用到 Faking(伪造) 时需要初始化anon插件,我这里等会要用,就初始化一下。
encryption=# SELECT anon.init();NOTICE: The anon extension is already initialized.init------t(1 row)
开启动态脱敏会创建一个叫 mask 的 schema
encryption=# \dnList of schemasName | Owner--------+-------------------anon | postgrespublic | pg_database_owner(2 rows)encryption=# select anon.start_dynamic_masking();start_dynamic_masking-----------------------t(1 row)encryption=# \dnList of schemasName | Owner--------+-------------------anon | postgresmask | postgrespublic | pg_database_owner(3 rows)
创建测试表
create table mask(id int primary key,name varchar(64),phone varchar(11),idcard varchar(64));insert into mask values(1,'红太狼','12345678911','1234567891012134');insert into mask values(2,'灰太狼','12345678911','1234567891012134');
声明脱敏规则
encryption=# SECURITY LABEL FOR anon ON COLUMN mask.id IS 'MASKED WITH VALUE NULL';SECURITY LABELencryption=# SECURITY LABEL FOR anon ON COLUMN mask.name IS 'MASKED WITH FUNCTION anon.fake_first_name()';SECURITY LABEL
定义一个脱敏用户,然后给 mask 用户授权 select
encryption=# CREATE user mask password 'postgres123';CREATE ROLEencryption=# SECURITY LABEL FOR anon ON ROLE mask IS 'MASKED';SECURITY LABELencryption=# grant select on mask to mask;GRANTencryption=# SELECT anon.start_dynamic_masking();NOTICE: schema "mask" already exists, skippingstart_dynamic_masking-----------------------t(1 row)
我们切到 mask 用户去看下这个表
encryption=> select * from mask;id | name | phone | idcard----+-------+-------------+------------------| Jason | 12345678911 | 1234567891012134| Aimee | 12345678911 | 1234567891012134(2 rows)encryption=> select * from mask;id | name | phone | idcard----+--------+-------------+------------------| Joanne | 12345678911 | 1234567891012134| Dawn | 12345678911 | 1234567891012134(2 rows)
可以看到已经遮掩了,还挺牛逼。
它还有几个视图,文档没有描述,我们大概能猜到是干嘛的
view anon.pg_identifiersview anon.pg_masked_rolesview anon.pg_masking_rulesview anon.pg_masksview anon.pg_trusted_functions
anon.pg_identifiers 应该是搜索标识符:https://postgresql-anonymizer.readthedocs.io/en/stable/detection/
encryption=# select * from anon.pg_identifiers;attrelid | attnum | relname | attname | format_type | col_description | indirect_identifier | priority----------+--------+---------+---------+-------------+-----------------+---------------------+----------(0 rows)
anon.pg_masked_roles 长得和 pg_authid 差不多。关于用户定义的动态脱敏的,hasmask 可以看到是否开启动态脱敏。
encryption=# select * from anon.pg_masked_roles where hasmask='t'\gx-[ RECORD 1 ]--+-----------------------------rolname | maskrolsuper | frolinherit | trolcreaterole | frolcreatedb | frolcanlogin | trolreplication | frolconnlimit | -1rolpassword | ********rolvaliduntil |rolbypassrls | frolconfig | {"search_path=mask, public"}oid | 60570hasmask | t
anon.pg_masking_rules 定义的是脱敏规则
encryption=# select * from anon.pg_masking_rules\gx-[ RECORD 1 ]----+--------------------------------------------attrelid | 60561attnum | 1relnamespace | publicrelname | maskattname | idformat_type | integercol_description | MASKED WITH VALUE NULLmasking_function |masking_value | NULLpriority | 100masking_filter | NULLtrusted_schema | f-[ RECORD 2 ]----+--------------------------------------------attrelid | 60561attnum | 2relnamespace | publicrelname | maskattname | nameformat_type | character varying(64)col_description | MASKED WITH FUNCTION anon.fake_first_name()masking_function | anon.fake_first_name()masking_value |priority | 100masking_filter | anon.fake_first_name()trusted_schema | t
anon.pg_masks 和 anon.pg_masking_rules 差不多,也是脱敏规则
encryption=# select * from anon.pg_masks\gx-[ RECORD 1 ]----+--------------------------------------------attrelid | 60561attnum | 1relnamespace | publicrelname | maskattname | idformat_type | integercol_description | MASKED WITH VALUE NULLmasking_function |masking_value | NULLpriority | 100masking_filter | NULLtrusted_schema | f-[ RECORD 2 ]----+--------------------------------------------attrelid | 60561attnum | 2relnamespace | publicrelname | maskattname | nameformat_type | character varying(64)col_description | MASKED WITH FUNCTION anon.fake_first_name()masking_function | anon.fake_first_name()masking_value |priority | 100masking_filter | anon.fake_first_name()trusted_schema | t
anon.pg_trusted_functions; 应该是允许原生PG用来脱敏的函数
encryption=# select * from anon.pg_trusted_functions;schema | function------------+---------------------------------------------------------------pg_catalog | "json_object"(text[])pg_catalog | "left"(text, integer)pg_catalog | "right"(text, integer)pg_catalog | age(timestamp without time zone, timestamp without time zone)pg_catalog | array_to_json(anyarray)pg_catalog | btrim(text)pg_catalog | concat(VARIADIC "any")pg_catalog | date_part(text, timestamp without time zone)pg_catalog | date_trunc(text, timestamp without time zone)pg_catalog | json_build_array()pg_catalog | json_build_object()pg_catalog | jsonb_build_array()pg_catalog | jsonb_build_object()pg_catalog | jsonb_object(text[])pg_catalog | length(text)pg_catalog | lower(text)pg_catalog | ltrim(text)pg_catalog | make_date(year integer, month integer, day integer)pg_catalog | make_time(hour integer, min integer, sec double precision)pg_catalog | md5(text)pg_catalog | now()pg_catalog | random()pg_catalog | regexp_replace(text, text, text)pg_catalog | regexp_replace(text, text, text, text)pg_catalog | repeat(text, integer)pg_catalog | replace(text, text, text)pg_catalog | row_to_json(record)pg_catalog | rtrim(text)pg_catalog | substr(text, integer)pg_catalog | to_char(timestamp without time zone, text)pg_catalog | to_date(text, text)pg_catalog | to_json(anyelement)pg_catalog | to_jsonb(anyelement)pg_catalog | to_number(text, text)pg_catalog | to_timestamp(text, text)pg_catalog | upper(text)(36 rows)
结束语
这篇文章光是安装 postgresql_anonymizer 就编译了很久。
吐槽:喵的,pgrx 真有你的,cargo pgrx init 初始化要下载安装编译 PG 12~17,PG17 少了几个依赖包重新安装编译轮回好几把!!!!!!
好了就写到这啦,下次想玩别的脱敏函数再写叭
参考
https://docs.percona.com/percona-server/8.0/data-masking-overview.html#data-masking-overview
https://docs.percona.com/percona-server/8.0/data-masking-plugin-functions.html#data-masking-plugin-functions
https://postgresql-anonymizer.readthedocs.io/en/stable/INSTALL/




