作者
digoal
日期
2019-06-15
标签
PostgreSQL , 负载回放 , SQL回放 , SQL仿真
背景
我们想在测试环境中回放生产库的负载,分析一些性能问题。
我们想把ORACLE迁移到PG,正式割接之前,我们想知道在PG中回放于ORACLE生产环境中一模一样负载的SQL,在PG数据库上的性能表现、LOAD如何?
SQL仿真、SQL回放、负载回放等叫法,说的都是以上功能。
首先要采集源端SQL负载。然后要能原样还原SQL负载(包括连接数、并发请求、SQL请求的顺序,事务提交顺序都要保持和源端一致)。
目前有一些PG项目支持PG的源端负载分析,目标端原样回放。
sql仿真项目
今天在PG天天象上活动中,ODOO的同学提供了一个PG WIKI网站里面有很多SQL仿真的项目,多多交流,视野开阔。
https://wiki.postgresql.org/wiki/Statement_Playback
tsung
Project home page: http://tsung.erlang-projects.org/
Tsung implements a multi-threaded model that tracks which transactions each query belonged to and runs them with the same concurrency as the original.
Tsung also comes with a recorder which is a PostgreSQL proxy. You can connect it to your server, connect your client to it, and let it record a session at a time. Then in the configuration you get to choose how many of each sessions you want to mix.
Note that Tsung is not really designed to replay logs. It's designed for artificial workload generation, so it's very limited in the number of sessions it can replay ... not more than 200 or so. Depending on your real workload, this may be an insurmountable obstacle.
Tsung tutorial on load-testing Postgresql servers:
Load-testing a Postgresql server with Tsung
pgTsung: app-specific testing methodology
pgTsung: stage 1
pgTsung: stage 2
playr
Project home page: http://github.com/myYearbook/playr
Designed to identify potential issues resulting from software or hardware upgrades on myYearbook.com's high-volume databases, Playr is able test an entire application and provide the administrator with a report detailing the performance and error-related divergence of a Postgres configuration from the production workload. This sounds good, but how does it work? Playr works by capturing your production database workload, analyzing it, and processing it into a set of binary capture files. Once the workload capture and processing has been completed, replay can be performed using one or more driver systems. By using the binary capture files, Playr is able to recreate the production workload exactly; all concurrency, timing, and commit ordering remain identical to production.
Playr is not designed to work with lesser hardware than the machine where the log files were originally taken from. It is not designed to be a benchmark application in the traditional sense. If the new hardware can not keep up with the timing, Playr will give up its stress test and let you know that it fell behind.
pgreplay
Project home page: http://pgreplay.projects.postgresql.org/
pgreplay reads a PostgreSQL log file (not a WAL file), extracts the SQL statements and executes them in the same order and relative time against a PostgreSQL database cluster. This makes it straightforward to capture the data needed to drive it from a production server. pgreplay is single-threaded, but uses asynchronous query processing, so multiple connections can be handled simultaneously. pgreplay will use as many connections as the original run did, and query order and timing are retained. If the execution of statements gets behind schedule, warning messages are issued that indicate that the server cannot handle the load in a timely fashion. The idea is to replay a real-world database workload as exactly as possible.
Features:
Should compile and run on any platform that PostgreSQL supports
Can replay the workload at different speeds
Can parse "stderr" and "csvlog" log files
Can save workload to replay in "replay file" for reuse
Limitations:
Statements that are not logged, such as COPY, will not be played back.
参考
https://wiki.postgresql.org/wiki/Statement_Playback
https://github.com/gocardless/pgreplay-go
https://github.com/laurenz/pgreplay
http://manpages.org/pgreplay
https://www.commandprompt.com/blog/postgres-load-testing-pgreplay/
PostgreSQL 许愿链接
您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.
9.9元购买3个月阿里云RDS PostgreSQL实例
PostgreSQL 解决方案集合
德哥 / digoal's github - 公益是一辈子的事.





