暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 敏感信息遮掩插件 anon - security label provider - anon

digoal 2019-11-15
1220

作者

digoal

日期

2019-11-15

标签

PostgreSQL , 敏感信息遮掩 , security label provider , anon


背景

PostgreSQL security label功能,是一个安全框架,开放了security label provider接口,通过实现这个接口,可以实现数据的细粒度安全控制,例如对某些用户,只能看到经过混淆的数据。

  • load security label lib
  • user(打了标的) -> search -> security label(object) -> security 混淆函数 -> 返回混淆结果

例如selinux lable lib

https://www.postgresql.org/docs/12/sepgsql.html

```
shared_preload_libraries = 'sepgsql'

SECURITY LABEL FOR selinux ON TABLE mytable IS 'system_u:object_r:sepgsql_table_t:s0';
```

例如,禁止直接访问credit字段

```
postgres=# CREATE TABLE customer (
cid int primary key,
cname text,
credit text
);
CREATE TABLE

postgres=# SECURITY LABEL ON COLUMN customer.credit
IS 'system_u:object_r:sepgsql_secret_table_t:s0';
SECURITY LABEL
```

允许通过函数访问credit

```
postgres=# CREATE FUNCTION show_credit(int) RETURNS text
AS 'SELECT regexp_replace(credit, ''-[0-9]+$'', ''-xxxx'', ''g'')
FROM customer WHERE cid = $1'
LANGUAGE sql;
CREATE FUNCTION

postgres=# SECURITY LABEL ON FUNCTION show_credit(int)
IS 'system_u:object_r:sepgsql_trusted_proc_exec_t:s0';
SECURITY LABEL
```

The above operations should be performed by an administrative user.

```
postgres=# SELECT * FROM customer;
ERROR: SELinux: security policy violation

postgres=# SELECT cid, cname, show_credit(cid) FROM customer;
cid | cname | show_credit
-----+--------+---------------------
1 | taro | 1111-2222-3333-xxxx
2 | hanako | 5555-6666-7777-xxxx
(2 rows)
```

In this case, a regular user cannot reference customer.credit directly, but a trusted procedure show_credit allows the user to print the credit card numbers of customers with some of the digits masked out.

查询打了哪些标

```
postgres=# select * from pg_seclabel;
objoid | classoid | objsubid | provider | label
--------+----------+----------+----------+-------
(0 rows)

postgres=# select * from pg_seclabels;
objoid | classoid | objsubid | objtype | objnamespace | objname | provider | label
--------+----------+----------+---------+--------------+---------+----------+-------
(0 rows)
```

除了selinux , pg外围也有一些利用security label接口来实现敏感信息遮掩的插件,例如anon。

anon - security label provider for 敏感信息遮掩

https://postgresql-anonymizer.readthedocs.io/en/latest/

首先是给用户打标,表示这个用户查询数据时需要加载security label provider -> anon.

然后给对象打标,凡事打标了的用户,查询打标了的对象,都会过对象打标时对应的敏感信息遮掩函数,过滤敏感信息。

Example

=# SELECT * FROM people; id | fistname | lastname | phone ----+----------+----------+------------ T1 | Sarah | Conor | 0609110911

Step 1 : Activate the dynamic masking engine

=# CREATE EXTENSION IF NOT EXISTS anon CASCADE; =# SELECT anon.start_dynamic_masking();

Step 2 : Declare a masked user

```
=# CREATE ROLE skynet LOGIN;

=# SECURITY LABEL FOR anon ON ROLE skynet IS 'MASKED';
```

Step 3 : Declare the masking rules

anon.fake_last_name和anon.partial是两个遮掩算法函数。

对people.lastname打安全标签,使用anon.fake_last_name安全过滤函数。相当于查询的时候,这个字段的值会被fake_last_name函数清洗一下再进入下一步

```
=# SECURITY LABEL FOR anonON COLUMN people.lastname
-# IS 'MASKED WITH FUNCTION anon.fake_last_name()';

=# SECURITY LABEL FOR anon ON COLUMN people.phone
-# IS 'MASKED WITH FUNCTION anon.partial(phone,2,$$**$$,2)';
```

Step 4 : Connect with the masked user

=# \! psql peopledb -U skynet -c 'SELECT * FROM people;' id | fistname | lastname | phone ----+----------+-----------+------------ T1 | Sarah | Stranahan | 06******11

anon敏感信息遮掩算法函数支持情况

对象被实施于某个遮掩函数,相当于查询对象内容时,先过一下遮掩函数,再返回。

内置遮掩函数

https://postgresql-anonymizer.readthedocs.io/en/latest/masking_functions/

  • Adding Noise
  • Shuffling
  • Randomization
  • Faking
  • Partial scrambling

算法选择

  • For names and other 'direct identifiers' , faking is often usefull
  • Shuffling is convienient for foreign keys
  • Adding noise is interesting for numeric values and dates
  • Partial Scrambling is perfect for email address and phone numbers
  • etc.

例如:

添加噪音

  • anon.add_noise_on_numeric_column(table, column,ratio) if ratio = 0.33, all values of the column will be randomly shifted with a ratio of +/- 33%
  • anon.add_noise_on_datetime_column(table, column,interval) if interval = '2 days', all values of the column will be randomly shifted by +/- 2 days

随机值混淆

  • anon.random_date() returns a date
  • anon.random_date_between(d1,d2) returns a date between d1 and d2
  • anon.random_int_between(i1,i2) returns an integer between i1 and i2
  • anon.random_string(n) returns a TEXT value containing n letters
  • anon.random_zip() returns a 5-digit code
  • anon.random_phone(p) return a 8-digit phone with p as a prefix

显示部分

  • anon.partial('abcdefgh',1,'xxxx',3) will return 'axxxxfgh';
  • anon.email('daamien@gmail.com') will becomme 'da@gm**.com'

赝品,需要加载赝品库,随机使用赝品库中的内容代替

Once the fake data is loaded you have access to 12 faking functions:

  • anon.fake_first_name() returns a generic first name
  • anon.fake_last_name() returns a generic last name
  • anon.fake_email() returns a valid email address
  • anon.fake_city() returns an existing city
  • anon.fake_city_in_country(c) returns a city in country c
  • anon.fake_region() returns an existing region
  • anon.fake_region_in_country(c) returns a region in country c
  • anon.fake_country() returns a country
  • anon.fake_company() returns a generic company name
  • anon.fake_iban() returns a valid IBAN
  • anon.fake_siret() returns a valid SIRET
  • non.fake_siren() returns a valid SIREN

For TEXT and VARCHAR columns, you can use the classic Lorem Ipsum generator:

  • anon.lorem_ipsum() returns 5 paragraphs
  • anon.lorem_ipsum(2) returns 2 paragraphs
  • anon.lorem_ipsum( paragraphs := 4 ) returns 4 paragraphs
  • anon.lorem_ipsum( words := 20 ) returns 20 words
  • anon.lorem_ipsum( characters := 7 ) returns 7 characters

自定义遮掩函数

Write your own Masks !

You can also use you own functions as a mask. The function must either be destructive (like [Partial Scrambling]) or insert some randomness in the dataset (like [faking]).

For instance, if you wrote a function foo(), you can apply it like this:

COMMENT ON COLUMN player.score IS 'MASKED WITH FUNCTION foo()';

关于类型转换

如果混淆函数返回的类型和原列类型不一致,可以转换后再返回,例如

The faking functions will return values in TEXT data types. The random functions will return TEXT, INTEGER or TIMESTAMP WITH TIMEZONE.

If the column you want to mask is in another data type (for instance VARCHAR(30), then you need to add an explicit cast directly in the COMMENT declaration, like this:

=# COMMENT ON COLUMN clients.family_name -# IS 'MASKED WITH FUNCTION anon.fake_last_name()::VARCHAR(30)';

参考

https://postgresql-anonymizer.readthedocs.io/en/latest/dynamic_masking/

https://github.com/lacanoid/pgddl

https://pgxn.org/dist/postgresql_anonymizer/0.5.0/

https://www.postgresql.org/docs/current/sql-security-label.html

https://www.postgresql.org/docs/current/sepgsql.html

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论