你应该收集系统统计信息吗？

赵勇 2025-02-08

215

介绍

你应该收集系统统计信息吗? 如果你部署了Oracle数据库，那么你对这个问题会有自己的看法，而Oracle会给出什么建议呢？在开始这个问题之前，我先简要介绍一下现有的Oracle相关资料，然后再讨论系统统计信息如何影响SQL执行计划。如果你希望直接了解我的观点，那么请查看下面的建议部分。

现在建议的简要说明

Oracle白皮书和文档包含了部分建议，但是存在一些差异，并且随着时间的推移，也有一些变化。Oracle引入DBMS_STATS.GATHER_SYSTEM_STATS还要回到2001时发布的Oracle Database 9i，在Database Performance Guide and Reference 中其清晰的描述收集系统统计信息是极力推荐的。在后续版本的文档中，直到近期，一直有点模棱两可了。Oracle还添加了EXADATA选项，以考虑Oracle EXADATA数据库的特性。

另一方面，在Oracle Optimizer white papers中，其建议使用默认的系统统计信息，并且不要手动收集它。Recent updates to the documentation 中延续了这一表述。如果系统统计信息未被收集，优化器会在数据库启动时测量CPU速度，并在计算IO成本时，使用默认的指标。

系统统计信息是如何影响SQL执行计划的

系统统计信息背后的理念是测量系统CPU和存储子系统（比如NAS，SAN，JBOD或FLASH）的性能，当评估候选SQL执行计划的成本时，利用该信息。
当SQL执行时，会消耗CPU和存储资源（在部分情况下）。一个典型的查询会有多个潜在SQL执行计划，每一个执行计划都会消耗不同比例的CPU和存储IO。Oracle优化器的工作（笼统地说）就是找到预估执行花费时间最少的执行计划。而实际上，它是使用一个称为cost的内部度量单位来比较候选的执行计划。优化器会选择（在它考虑到的所有执行计划中）预估成本最少的执行计划。如果数据库知道存储和CPU实际上有多快，那么他就可以对每一个候选执行计划做出更精细的评价。

设想一个有三个可能的执行计划的查询。每个计划使用了不同量的CPU和IO。下图展示了计划1被预期使用了大量的CPU，而非常少的IO资源。对于这个例子，我们假设计划1具有最小的评估成本并且已经被优化器所选中。
图片.png

考虑一个有极高性能的存储架构的系统。它有可能改变在CPU和IO之间的相对性能平衡。收集系统统计信息会让优化器将其考虑进来。在我们的例子中，高性能存储的存在会显著降低计划2和计划3的成本，而计划1只降低了少量（因为它使用了较少的IO资源）。这个改变，足以使计划3成为最佳选择，它当下具有最低的预估成本。
图片.png

在有快速IO架构的系统上，我们可以预期，使用系统统计信息会略微偏向于优先使用表扫描，而不是使用索引。
图片.png

建议

如果收集统计信息有益于你的工作负载，并且你也乐于管理它。那么你无需改变。反之，如果你正在做出决策，需要选择是，还是不收集它们，那么在大多数情况下，你应该使用默认的，不收集系统统计信息。

有一个例外情况需要考虑。在Oracle Exadata主机上，若数据库支撑着一个纯数据仓库的工作负载，其可以从使用EXADATA选项来收集系统统计信息中受益。这会使表扫描更有可能，当然，这直接发挥了该架构的主要优势之一。无论如何，如果工作负载是混合的，或者你所处的位置无法测试使用EXADATA系统统计信息的效果，那么即便是在该平台上，也应遵守使用默认值。

重要的是要记住，如果你改变了使用系统统计信息的方式，那么这会影响SQL执行计划。出于这个原因，只有当您处于决策点并且能够测试更改的效果（或者至少在一个工作负载的性能发生不受欢迎的变化时，您可以缓解风险的位置时）时，才应该进行更改。

建议的证明

建议总会引起争议。我很清楚，一些DBA和架构师非常乐意使用系统统计数据，并认为它们非常有益。这一观点没有错，毫无疑问，在某些情况下，系统统计数据打破了平衡，提高了某些查询的性能。然而，并非所有查询都处于同一个临界点，因此很可能对某些查询来说效果不佳。一组特定的系统统计数据也可能不会使数据库处理的所有工作负载（如ETL/ELT和OLTP）受益。最终的结果是，收集系统统计数据不太可能是以一致的方式全面提供理想性能的尚方宝剑：您可能仍然需要调整一些查询。当然，您可能从未想过系统统计信息是尚方宝剑，但维护它们至少需要一些管理或过程开销，并且根据您的使用方式，它们可能会在SQL执行计划中引入一些可变性（在不同系统之间或在单个系统上随时间变化）。该建议允许您避免潜在的计划可变性和管理开销。

如果你正在测试一个收集系统统计数据可以提高性能的工作负载，那么花一些时间找出原因是值得的。不太可能有全面的改善。更有可能的是，少数重要查询的性能有所提高，或者可能存在一组具有共同特征的查询（可能访问一组特定的表或使用特定的谓词组合）。你可能会发现，最好将自己从管理系统统计数据中解放出来，而是使用Oracle提供的工具来调整性能不如你预期的查询。

据我所知，相关的MOS和文档已与此消息保持一致。如果你发现任何与此相矛盾的地方，请发表评论。

欢迎评论！

原文链接：https://blogs.oracle.com/optimizer/post/should-you-gather-system-statistics
Should You Gather System Statistics?
January 1, 2020 | 5 minute read
Nigel Bayliss
Product Manager

Introduction

Should you gather system statistics? If you deploy Oracle Databases then you will have an opinion on the answer, but what does Oracle recommend? Before I get to that, I’ll present a brief survey of existing Oracle collateral and then cover how systems statistics affect SQL execution plans. If you want to skip directly to the point, then check out the recommendation section, below.

A Brief Survey of Existing Recommendations

Oracle white papers and documentation include some recommendations, but there are some differences and there has been some variation over time. Oracle introduced DBMS_STATS.GATHER_SYSTEM_STATS back in 2001 with Oracle Database 9i and in the Database Performance Guide and Reference it clearly states that gathering system statistics is highly recommended. Later versions of the documentation have been, until recently, a little more equivocal. Oracle added an EXADATA option to take into account the characteristics of Oracle Exadata Database Machines.

Oracle Optimizer white papers, on the other hand, recommend using system statistics defaults and not gathering them manually. Recent updates to the documentation have aligned with this message. If systems statistics are not gathered, the optimizer measures CPU speed when the database starts and IO costing is calculated using default metrics.

How System Statistics Affect SQL Execution Plans

The concept behind system statistics is to measure the performance of system CPU and the storage subsystem (such as NAS, SAN, JBOD or flash) and use this information when costing alternative SQL execution plans.

A query consumes CPU and (in many cases) storage resources when it executes. A typical query will have many potential SQL execution plans and each plan may consume a different proportion of CPU versus storage IO. The Oracle Optimizer’s job is (broadly speaking) to find the plan that is estimated to take the least amount of time to execute. In reality, it compares alternative plans using an internal metric called cost. The optimizer chooses the plan with the lowest estimated cost (out of all the plans it considers). If the database knows how fast storage and CPU actually is, then it can make finer judgements about the cost of each alternative plan.

Consider a query that has three possible plans. Each plan uses a different amount of CPU and IO. The diagram below illustrates that Plan 1 is expected to use a large amount of CPU and very little IO. For the sake of this example, we will assume that Plan 1 is deemed to have the lowest estimated cost and has duly been chosen by the optimizer.
图片.png
Consider now a system that has a particularly high performance storage infrastructure. It has the potential to shift the relative performance balance between the CPU and IO. Gathering system statistics will allow the optimizer to take this into account. In our example, the presence of high performance storage lowers the relative cost of Plan 2 and Plan 3 significantly, and Plan 1 by a small amount (because it uses less IO). This shift is enough to make Plan 3 the best choice; it now has the lowest estimated cost.
图片.png
On systems with fast IO infrastructure, we can expect the use of system statistics to tip the balance a little towards using table scans in preference to indexes.

Recommendation

If gathering system statistics benefits your workload and you are happy to manage them, then you have no reason to change. Alternatively, if you are at a decision point and you need to choose whether to gather them or not, then in most cases you should use the defaults and not gather system statistics.

There is an exceptional case to consider. Databases supporting a pure data warehouse workload on an Oracle Exadata Database Machine can benefit from system statistics gathered using the EXADATA option. It will make table scans more likely and, of course, this plays directly into one of the architecture’s major strengths. Nevertheless, if the workload is mixed or you are not in a position to test the effect of using EXADATA system statistics, then stick to the defaults even on this platform.

It is important to remember that if you change the way you use system statistics, then this can affect SQL execution plans. For this reason, you should only make a change if you are at a decision point and you are able to test the effect of the change (or at least in a position to mitigate the risk of undesirable changes in workload performance).

Justifying the Recommendation

The recommendation is always going to be controversial. I am well aware that some DBAs and architects are very happy using system statistics and consider them highly beneficial. There is nothing wrong with this position and there are undoubtedly cases where system statistics have tipped the balance and improved the performance of some queries. However, this tipping-point is not in the same place for all queries, so it is quite possible that it will not work out well for some. It is also possible that a particular set of system statistics will not benefit all workloads handled by the database (such as ETL/ELT and OLTP). The net result is that gathering system statistics is unlikely to be the silver bullet that delivers ideal performance across the board in a consistent manner: you might still have to tune some queries. Of course, you probably never expected system statistics to be a silver bullet, but there is at least some management or procedural overhead required to maintain them and, depending on how you use them, they can introduce some variability in SQL execution plans (between different systems or over time on a single system). The recommendation allows you to avoid potential plan variability and management overhead.

If you are testing a workload where gathering system statistics has improved performance, it is worth spending some time to find out why. It is unlikely that there has been an across-the-board improvement. It is more likely that the performance of a small number of significant queries has improved or there are perhaps there are a group of queries with a common pattern (maybe accessing a particular set of tables or using a particular combination of predicates). You might find that it is better to free yourself from managing system statistics and, instead, use the tools that Oracle provides you with to tune the queries that are not performing as well as you want.

To the best of my knowledge, relevant MOS notes and the docs have been aligned with this message. If you spot anything contradicting this, then please post a comment.

All comments welcome!

系统统计信息

文章转载自赵勇，如果涉嫌侵权，请发送邮件至：contact@modb.pro进行举报，并提供相关证据，一经查实，墨天轮将立刻删除相关内容。