作者:Masahiko Sawada和Amit Kapila
Mahendra Singh和Sergei Kornilov
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.comhttps://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
=$ CREATE TABLE test (id INT4 GENERATED ALWAYS AS IDENTITY PRIMARY KEY,some_int INT4,some_timestamp TIMESTAMPTZ,other_int INT4,other_timestamp TIMESTAMPTZ);CREATE TABLE=$ INSERT INTO test (some_int, some_timestamp, other_int, other_timestamp)SELECTrandom() * 500000000,'2000-01-01'::DATE + '20 years'::INTERVAL * random(),random() * 500000000,'1970-01-01'::DATE + '20 years'::INTERVAL * random()FROMgenerate_series(1,100000000) i;INSERT 0 100000000
=$ SELECT * FROM test LIMIT 10;id | some_int | some_timestamp | other_int | other_timestamp----+-----------+-----------------------------+-----------+-----------------------------1 | 179275930 | 2010-01-28 23:02:57.0048+01 | 31223069 | 1975-10-21 03:30:13.536+012 | 119009254 | 2019-05-01 14:40:19.5168+02 | 390536066 | 1974-02-18 16:12:31.7952+013 | 153965899 | 2010-04-26 00:36:46.3968+02 | 109395281 | 1985-10-09 01:30:36.0288+014 | 123106154 | 2006-06-28 18:59:21.0624+02 | 399537003 | 1982-12-08 21:13:32.5056+015 | 338157258 | 2006-11-04 07:21:34.7328+01 | 487378393 | 1975-02-14 05:59:28.7232+016 | 108837322 | 2006-10-10 04:07:35.2704+02 | 53539283 | 1987-05-17 00:59:49.5744+027 | 434671405 | 2011-04-09 00:21:43.4304+02 | 374841058 | 1980-05-13 17:12:37.1808+028 | 407587896 | 2013-08-02 15:26:41.3376+02 | 180180561 | 1985-01-06 04:22:03.6768+019 | 450852732 | 2008-10-27 18:14:00.4992+01 | 81128068 | 1975-06-26 01:55:29.8848+0110 | 306987401 | 2013-12-17 08:39:19.1232+01 | 28668776 | 1970-08-01 15:43:29.0208+01(10 ROWS)
=$ CREATE INDEX i1 ON test (some_int);CREATE INDEX=$ CREATE INDEX i2 ON test (some_timestamp);CREATE INDEX=$ CREATE INDEX i3 ON test (other_int);CREATE INDEX=$ CREATE INDEX i4 ON test (other_timestamp);CREATE INDEX=$ CREATE INDEX i5 ON test (some_int, some_timestamp);CREATE INDEX=$ CREATE INDEX i6 ON test (other_int, other_timestamp);CREATE INDEX
=$ SELECT c.relname, c.relkind, pg_size_pretty(pg_relation_size(c.oid))FROM pg_class cWHERE c.relname = 'test' ORc.oid IN ( SELECT i.indexrelid FROM pg_index i WHERE i.indrelid = 'test'::regclass );relname | relkind | pg_size_pretty-----------+---------+----------------test | r | 5744 MBtest_pkey | i | 2142 MBi1 | i | 2142 MBi2 | i | 2142 MBi3 | i | 2142 MBi4 | i | 2142 MBi5 | i | 3004 MBi6 | i | 3004 MB(8 ROWS)
=$ DELETE FROM test WHERE random() < 0.5;
这样索引(和表)将需要进行一些清理。
=$ vacuum (verbose ON, analyze ON, parallel 0) test;
之后,重新创建了整个内容,然后运行:
=$ SET max_parallel_maintenance_workers = 2;=$ vacuum (verbose ON, analyze ON) test;
最后,重新创建:
=$ SET max_parallel_maintenance_workers = 8;=$ vacuum (verbose ON, analyze ON) test;
日志显示,对于连续vacuum:
=$ vacuum (verbose ON, analyze ON, parallel 0) test;psql:test1.sql:2: INFO: vacuuming 'public.test'...CPU: USER: 663.45 s, system: 87.05 s, elapsed: 1505.04 s.psql:test1.sql:2: INFO: analyzing 'public.test'psql:test1.sql:2: INFO: 'test': scanned 30000 OF 735295 pages, containing 2040581 live ROWS AND 0 dead ROWS;30000 ROWS IN sample, 50014300 estimated total ROWSVACUUMTIME: 1505238.738 ms (25:05.239)
2位工人(进程的形象描述)的vacuum清理工作:
=$ SET max_parallel_maintenance_workers = 2;SET=$ vacuum (verbose ON, analyze ON) test;psql:test2.sql:3: INFO: vacuuming 'public.test'psql:test2.sql:3: INFO: launched 2 parallel vacuum workers FOR INDEX vacuuming (planned: 2)...CPU: USER: 119.29 s, system: 43.63 s, elapsed: 694.13 s.psql:test2.sql:3: INFO: analyzing 'public.test'psql:test2.sql:3: INFO: 'test': scanned 30000 OF 735295 pages, containing 2039828 live ROWS AND 0 dead ROWS;30000 ROWS IN sample, 49995844 estimated total ROWSVACUUMTIME: 694336.035 ms (11:34.336)
=$ SET max_parallel_maintenance_workers = 8;SET=$ vacuum (verbose ON, analyze ON) test;psql:test3.sql:3: INFO: vacuuming 'public.test'psql:test3.sql:3: INFO: launched 6 parallel vacuum workers FOR INDEX vacuuming (planned:6)CPU: USER: 134.24 s, system: 51.37 s, elapsed: 776.12 s.psql:test3.sql:3: INFO: analyzing 'public.test'psql:test3.sql:3: INFO: 'test': scanned 30000 OF 735295 pages, containing 2040985 live ROWS AND 0 dead ROWS; 30000 ROWS IN sample, 50024202 estimated total ROWSVACUUMTIME: 776326.118 ms (12:56.326)
看起来需要更多的调整才能与更多的工作人员一起运行,但是在我的测试案例中-只有7个索引,因此不会有太大的收获。
另外–速度似乎最好由2个工作人员来完成,但这可能与服务器上的一些负载有关。无论如何,并行处理索引时,vacuum都快得多。
关于我们
中国开源软件推进联盟PostgreSQL分会(简称:PG分会)于2017年成立,由国内多家PG生态企业所共同发起,业务上接受工信部产业发展研究院指导。PG分会致力于构建PG产业生态,推动PG产学研用发展,是国内一家PG行业协会组织。

最后修改时间:2020-02-19 09:14:53
文章转载自开源软件联盟PostgreSQL分会,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




