Author
digoal
Date
2020-07-27
Tags
PostgreSQL , parallel , IO , batch , chunk
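Background
In a parallel sequential scan, each worker used to claim one block at a time. Because the workers interleave their requests, file system read-ahead may have no real effect, and the I/O that reaches the block device becomes scattered, which hurts performance.
Handing out blocks in chunks lets each worker issue batched, consecutive I/O, which performs better.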
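To make the difference concrete, here is a minimal simulation. It is not PostgreSQL code: the function names `allocate` and `sequential_fraction` are made up for illustration, and it assumes workers take turns in strict round-robin order, whereas real workers race each other. It only shows how per-block allocation interleaves the block numbers each worker reads, while chunked allocation keeps them consecutive.

```python
from collections import defaultdict


def allocate(nblocks, nworkers, chunk_size):
    """Simulate block allocation (hypothetical helper, not PostgreSQL code).

    Assumption: workers request work in strict round-robin order.
    Returns {worker_id: [block numbers in the order that worker reads them]}.
    """
    reads = defaultdict(list)
    next_block = 0
    worker = 0
    while next_block < nblocks:
        end = min(next_block + chunk_size, nblocks)
        reads[worker].extend(range(next_block, end))
        next_block = end
        worker = (worker + 1) % nworkers
    return dict(reads)


def sequential_fraction(blocks):
    """Fraction of reads whose block number directly follows the previous read."""
    if len(blocks) < 2:
        return 0.0
    hits = sum(1 for a, b in zip(blocks, blocks[1:]) if b == a + 1)
    return hits / (len(blocks) - 1)


if __name__ == "__main__":
    for chunk_size in (1, 4):
        per_worker = allocate(nblocks=16, nworkers=2, chunk_size=chunk_size)
        for worker_id, blocks in per_worker.items():
            print(chunk_size, worker_id, blocks, sequential_fraction(blocks))
```

With `chunk_size=1` every worker skips over blocks taken by the others, so none of its reads follow the previous one. With a larger chunk most reads within a worker are consecutive, which is the access pattern read-ahead is designed for.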
```
Allocate consecutive blocks during parallel seqscans
author David Rowley drowley@postgresql.org
Sun, 26 Jul 2020 17:02:45 +0800 (21:02 +1200)
committer David Rowley drowley@postgresql.org
Sun, 26 Jul 2020 17:02:45 +0800 (21:02 +1200)
commit 56788d2156fc32bd5737e7ac716d70e6a269b7bc
tree eae80693ce8db12d01d4dde7e429d35e253d2d2e
parent 11a68e4b53ffccf336a2faf5fa380acda28e880b
Previously we would allocate blocks to parallel workers during a parallel
sequential scan 1 block at a time. Since other workers were likely to
request a block before a worker returns for another block number to work
on, this could lead to non-sequential I/O patterns in each worker which
could cause the operating system's readahead to perform poorly or not at
all.
Here we change things so that we allocate consecutive "chunks" of blocks
to workers and have them work on those until they're done, at which time
we allocate another chunk for the worker. The size of these chunks is
based on the size of the relation.
Initial patch here was by Thomas Munro which showed some good improvements
just having a fixed chunk size of 64 blocks with a simple ramp-down near
the end of the scan. The revisions of the patch to make the chunk size
based on the relation size and the adjusted ramp-down in powers of two was
done by me, along with quite extensive benchmarking to determine the
optimal chunk sizes.
For the most part, benchmarks have shown significant performance
improvements for large parallel sequential scans on Linux, FreeBSD and
Windows using SSDs. It's less clear how this affects the performance of
cloud providers. Tests done so far are unable to obtain stable enough
performance to provide meaningful benchmark results. It is possible that
this could cause some performance regressions on more obscure filesystems,
so we may need to later provide users with some ability to get something
closer to the old behavior. For now, let's leave that until we see that
it's really required.
Author: Thomas Munro, David Rowley
Reviewed-by: Ranier Vilela, Soumyadeep Chakraborty, Robert Haas
Reviewed-by: Amit Kapila, Kirk Jamison
Discussion: https://postgr.es/m/CA+hUKGJ_EErDv41YycXcbMbCBkztA34+z1ts9VQH+ACRuvpxig@mail.gmail.com
```
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=56788d2156fc32bd5737e7ac716d70e6a269b7bc
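The commit message mentions a ramp-down of the chunk size in powers of two near the end of the scan. The sketch below illustrates that idea only. The function `chunk_sizes`, its parameters `initial_chunk` and `rampdown_chunks`, and the exact halving rule are assumptions chosen for this example, not taken from the PostgreSQL source, and the numbers are not the values used by the patch.

```python
def chunk_sizes(nblocks, initial_chunk, rampdown_chunks):
    """Return the sequence of chunk sizes handed out over one scan.

    Hypothetical rule for illustration: halve the chunk size whenever fewer
    than chunk * rampdown_chunks blocks remain, down to a minimum of 1.
    """
    sizes = []
    remaining = nblocks
    chunk = initial_chunk
    while remaining > 0:
        # Shrink in powers of two as the end of the relation approaches.
        while chunk > 1 and remaining < chunk * rampdown_chunks:
            chunk //= 2
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


if __name__ == "__main__":
    print(chunk_sizes(nblocks=64, initial_chunk=16, rampdown_chunks=2))
```

The point of shrinking the chunks is that no single worker is left holding a large chunk while the others have already run out of work, so all workers tend to finish at about the same time.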





