暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 未来OLTP场景优化 - 包括walinsert、buffer manage、async protocol、 CSN 优化原理 - 高并发场景优化

原创 digoal 2022-01-20
823

作者

digoal

日期

2020-08-12

标签

PostgreSQL , GetSnapshotData , 高并发


为什么要优化?
- 现在的机器核数都很多, 用户并发也很高, 由于每次获取事务快照都要掉用GetSnapshotData, 对procArray加共享锁, 遍历procArray后释放. 并发越高, procArray越大, 越耗费CPU.
- 事务结束时要对procArray加排他锁, 如果有高并发的查询, 会导致排他与共享锁冲突概率增加, 从而影响性能. 所以高并发小事务混合读写场景性能影响较为严重.

《PostgreSQL 14 GetSnapshotData 高并发优化》
《PostgreSQL 20200819当天代码 - 14 对比 13 高并发性能优化 数据对比 - get snapshot improve》

CSN 优化原理:

https://postgrespro.ru/media/2019/10/26/future_is_csn.pdf

  • Array of active transaction ids is stored in shared memory.
  • GetSnapshotData() scans all the active xids while holding shared ProcArrayLock.
  • Assigning of new xid doesn’t require ProcArrayLock.
  • Clearing active xid requires exclusive ProcArrayLock.
  • 9.6 comes with “group clear xid” optimization.
  • Multiple xids of finished transactions could be cleared using single exclusive ProcArrayLock.
  • Nowadays multi-core systems running can run thousands of backends simultaneously. For short queries GetSnapshotData() becomes just CPU expensive.
  • LWLock subsystem is just not designed for high concurrency. In particular, exclusive lock waits could have infinite starvation. Therefore, it’s impossible to connect while there is high flow of short readonly queries.
  • In the mixed read-write workload, ProcArrayLock could become a bottleneck.
  • Taking snapshots is cheaper. It’s even possible to make it lockless.
  • CSN snapshots are more friendly to distributed systems. Distributed visibility techniques like incremental snapshots or Clock-SI assumes that snapshot is represented by single number.
  • Make both XID and CSN 64-bit
  • Add 64-bit xid_epoch, multixact_epoch and csn_epoch to page header.
  • Allocate high bit of xmin and xmax for CSN flag.
  • Actual xid or csn stored in xmin or xmax should be found as corresponding epoch plus xmin or xmax.
  • We still can address 2^31 xids from xmin and xmax as we did before.
  • Wraparound is possible only inside single page. And it could be resolved by single page freeze.
  • Rewrite XID to CSN instead of seƫng “committed” hint bit.
  • Lockless snapshot taking
  • Buffer manager – slow hash-table, pin, locks etc.
  • Synchronous protocol.
  • Executor.
  • Slow xid allocation – a lot of locks.
SELECT val FROM t WHERE id IN (:id1, ... :id10) –  
150K per second = 1.5M key-value pairs per second, no gain.  
Bottleneck in buffer manager.  
SELECT 1 with CSN-rewrite patch – 3.9M queries per second.  
Bottleneck in Protocol and executor are bottlenecks.  
SELECT txid_current() – 390K per second.   
Bottleneck in locks.  
  • True in-memory engine without buffer manager.
  • Asynchronous binary protocol for processing more short queries.
  • Executor improvements including JIT-compilation.
  • Lockless xid allocation.

关联文章:
《为什么高并发读写的性能会下降? - tuple可见性判断, 事务快照损耗 - polardb hlc O(1)优化,无需快照判定可见行》
《每天5分钟,PG聊通透 - 系列1 - 热门问题 - 链接、驱动、SQL - 第4期 - 为什么增加连接不能无限提高TPS或QPS? 配置多少个链接合适?》
《PostgresPro buildin pool(内置连接池)版本 原理与测试》
《阿里云 RDS PostgreSQL 高并发特性 vs 社区版本 (1.6万并发: 3倍吞吐,240倍响应速度)》

期望 PostgreSQL 增加什么功能?

PolarDB for PostgreSQL云原生分布式开源数据库

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论