Oracle分区表上的统计信息（四）

原创赵勇云和恩墨技术服务团队 2021-10-31

1828

在上一篇贴子中，我演示说明了当你信赖Oracle在分区或子分区上收集统计信息，然后向上在表的更高级别上聚合生成全局统计信息时会遇到的问题。除非所有相关层级的统计信息均已具备，否则聚合并不会发生。比如，你已经拥有了4个子分区中3个的统计信息，但并不会在相关的分区上聚合统计信息，除非你在第4个子分区上也收集了统计信息之后。Randolf Geist 就发表过一篇有关如何避免这类问题的贴子：
“为了解决因添加初始缺少统计信息的分区，而会破坏聚合统计信息的问题，需要注意的是，对新添加的子分区应立即更新其统计信息（在这种情况下，为0行）–由于子分区是空的，因此不需要花费很多时间，但它解决了聚合统计信息的问题”

这也正是在我们的系统中所做的，但是Oracle最近版本中引入的一个变化，导致了引发这一系列文章的问题。。。

让我们先从一个空表（与第一篇贴子中的定义没有区别）开始。由于我们对于没有统计信息是如此的偏执，我们会在分区级收集统计信息，尽管这时表是空的。为了节省一点字数，我没有打算指定分区名称，但是在真实的系统中，我们会指定的。但不管怎样，我们会看到一样的问题和结果。

SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', GRANULARITY => 'PARTITION');

PL/SQL procedure successfully completed.

SQL> select     table_name, global_stats, last_analyzed, num_rows
  2  from dba_tables
  3  where table_name='TEST_TAB1'
  4  and owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     GLO LAST_ANALYZED          NUM_ROWS
------------------------------ --- -------------------- ----------
TEST_TAB1                      NO  28-FEB-2010 08:04:24          0

虽然不是真正的全局统计信息，但没关系，我们知道这一点。我们也清楚统计信息，是对此时表上没有数据的反映。而当我们观察分区级的统计信息时：-

SQL> select     table_name, partition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_partitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     PARTITION_NAME                 GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
TEST_TAB1                      P_20100131                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100201                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100202                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100203                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100204                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100205                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100206                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100207                     YES 28-FEB-2010 08:04:24          0

8 rows selected.

它们有真正的全局统计信息，尽管这时没有数据，但至少我们有了一些反映该情况的统计信息。再来观察子分区上的统计信息：-

SQL> select     table_name, subpartition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_subpartitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     SUBPARTITION_NAME              GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
TEST_TAB1                      P_20100131_GROT                NO
TEST_TAB1                      P_20100131_HALO                NO
TEST_TAB1                      P_20100131_JUNE                NO
TEST_TAB1                      P_20100131_OTHERS              NO
TEST_TAB1                      P_20100201_GROT                NO
TEST_TAB1                      P_20100201_HALO                NO
TEST_TAB1                      P_20100201_JUNE                NO
TEST_TAB1                      P_20100201_OTHERS              NO

<<截短输出>>

TEST_TAB1                      P_20100206_GROT                NO
TEST_TAB1                      P_20100206_HALO                NO
TEST_TAB1                      P_20100206_JUNE                NO
TEST_TAB1                      P_20100206_OTHERS              NO
TEST_TAB1                      P_20100207_GROT                NO
TEST_TAB1                      P_20100207_HALO                NO
TEST_TAB1                      P_20100207_JUNE                NO
TEST_TAB1                      P_20100207_OTHERS              NO

32 rows selected.

这时并没有子分区上的统计信息是我们预期的行为。当加载数据后，我们会收集它们的统计信息。我打算在这个阶段跳过列的统计信息，因为我并不需要它们来演示问题。让我们想象一下在现实的系统中，我们刚刚创建了上面的分区，并考虑向P_20100206_GROT子分区中加载数据


SQL> INSERT INTO TEST_TAB1 VALUES (20100206, 'GROT', 100000, 'P');

1 row created.

SQL> INSERT INTO TEST_TAB1 VALUES (20100206, 'GROT', 3000000, 'P');

1 row created.

SQL> INSERT INTO TEST_TAB1 VALUES (20100206, 'GROT', 200000, 'P');

1 row created.

SQL> INSERT INTO TEST_TAB1 VALUES (20100206, 'GROT', 110000, 'P');

1 row created.

SQL> INSERT INTO TEST_TAB1 VALUES (20100206, 'GROT', 240000, 'U');

1 row created.

SQL> COMMIT;

Commit complete.

接下来，我们调用正常的统计信息收集方法，在刚刚加载数据的子分区上收集统计信息。

SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', GRANULARITY => 'SUBPARTITION', 
                                         PARTNAME => 'P_20100206_GROT');

PL/SQL procedure successfully completed.

注意，这里需要指出的，在测试脚本中，我在空表上收集统计信息与在新子分区上插入并收集统计信息之间，放置了一个短暂的暂停。以便你注意到此时LAST_ANALYZED上的值。
那么，统计信息看起来是如何的呢？

SQL> select     table_name, global_stats, last_analyzed, num_rows
  2  from dba_tables
  3  where table_name='TEST_TAB1'
  4  and owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     GLO LAST_ANALYZED          NUM_ROWS
------------------------------ --- -------------------- ----------
TEST_TAB1                      NO  28-FEB-2010 08:06:25          0

嗯，我注意到LAST_ANALYZED的值更新了，但表级的NUM_ROWS仍然为0。分区上的统计信息又会如何呢？

SQL> select     table_name, partition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_partitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     PARTITION_NAME                 GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
TEST_TAB1                      P_20100131                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100201                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100202                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100203                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100204                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100205                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100206                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100207                     YES 28-FEB-2010 08:04:24          0

8 rows selected.

仅从表和分区上的统计信息来看，这个表仍然是空的。这可不好，我可以想像到不久的将来，带有基数为1和笛卡尔积的执行计划就会出现。注意分区上的LAST_ANALYZED的值，可以看到该时间戳并没有改变，这是出现错误的另一个迹象。
我需要检查一下子分区上统计信息是否正确收集了。


SQL> select     table_name, subpartition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_subpartitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     SUBPARTITION_NAME              GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
TEST_TAB1                      P_20100131_GROT                NO
TEST_TAB1                      P_20100131_HALO                NO
TEST_TAB1                      P_20100131_JUNE                NO
TEST_TAB1                      P_20100131_OTHERS              NO
TEST_TAB1                      P_20100201_GROT                NO
TEST_TAB1                      P_20100201_HALO                NO
TEST_TAB1                      P_20100201_JUNE                NO
TEST_TAB1                      P_20100201_OTHERS              NO

<<output snipped>>

TEST_TAB1                      P_20100206_GROT                YES 28-FEB-2010 08:06:25          5
TEST_TAB1                      P_20100206_HALO                NO
TEST_TAB1                      P_20100206_JUNE                NO
TEST_TAB1                      P_20100206_OTHERS              NO
TEST_TAB1                      P_20100207_GROT                NO
TEST_TAB1                      P_20100207_HALO                NO
TEST_TAB1                      P_20100207_JUNE                NO
TEST_TAB1                      P_20100207_OTHERS              NO

32 rows selected.

哈，可能这就是问题所在。只有P_20100206分区中的一个子分区具备有效的统计信息，所以，Oracle不能在更高层级上产生全局统计信息。因此，我会尝试在表的所有子分区上收集统计信息来修复这个问题（实际上，我只需要收集P_20100206中剩余其它子分区的统计信息，但为了简便，我使用了这种方法）。

SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', GRANULARITY => 'SUBPARTITION');

PL/SQL procedure successfully completed.

现在，我们来检查一下全部子分区都有了有效的统计信息。

SQL> select     table_name, subpartition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_subpartitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     SUBPARTITION_NAME              GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
TEST_TAB1                      P_20100131_GROT                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100131_HALO                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100131_JUNE                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100131_OTHERS              YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100201_GROT                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100201_HALO                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100201_JUNE                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100201_OTHERS              YES 28-FEB-2010 08:06:25          0

<<截短输出>>

TEST_TAB1                      P_20100206_GROT                YES 28-FEB-2010 08:06:25          5
TEST_TAB1                      P_20100206_HALO                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100206_JUNE                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100206_OTHERS              YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100207_GROT                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100207_HALO                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100207_JUNE                YES 28-FEB-2010 08:06:25          0
TEST_TAB1                      P_20100207_OTHERS              YES 28-FEB-2010 08:06:25          0

32 rows selected.

好了，现在Oracle应该可以聚合子分区上的统计信息，在分区和表级别上生成全局统计信息了。


SQL> select     table_name, global_stats, last_analyzed, num_rows
  2  from dba_tables
  3  where table_name='TEST_TAB1'
  4  and owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     GLO LAST_ANALYZED          NUM_ROWS
------------------------------ --- -------------------- ----------
TEST_TAB1                      NO  28-FEB-2010 08:06:25          0

SQL> 
SQL> select     table_name, partition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_partitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     PARTITION_NAME                 GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
TEST_TAB1                      P_20100131                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100201                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100202                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100203                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100204                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100205                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100206                     YES 28-FEB-2010 08:04:24          0
TEST_TAB1                      P_20100207                     YES 28-FEB-2010 08:04:24          0

8 rows selected.

依据表和分区上的统计信息，表仍然是空的，并且这些分区统计信息仍然没有被更新。
问题是Oracle不会使用聚合的全局统计信息覆盖真正的全局统计信息，仔细想想，这是一种明智的方法，因为如果我有一个收集表和分区统计数据的策略（即第一篇文章中提到的Oracle推荐策略），那么我最不希望的是，当在子分区上收集统计数据时，这些全局统计数据不断被聚合的统计数据（使用不正确的NDV）所覆盖！

我们在这里的错误应被视为以下两种错误的组合：
a）不遵循Oracle的建议（因为如果我们这样做，我们还要再使用单独的任务收集表和分区上的全局统计数据）；
b）一旦我们偏离该策略，在不正确级别上收集统计数据。我们收集的那些分区的统计数据永远不会被重写，除非在分区上再次收集统计数据，然后将其聚合到表级别。

考虑到我们想要（亦或是必须要）实现的，即使用当前的方法，我们应该仅需在子分区级别上收集统计信息，然后这些统计信息会被聚合到表和分区级别。
为了修复这一问题，我们删除了已存在的统计信息，以去除分区上的全局统计信息。然后做为一个一次性的练习，重新在子分区级别收集。


SQL> exec dbms_stats.delete_table_stats('TESTUSER', 'TEST_TAB1')

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats('TESTUSER', 'TEST_TAB1', GRANULARITY => 'SUBPARTITION');

PL/SQL procedure successfully completed.

一个重要改变是，现在，当我们在新的子分区上收集统计信息时，我们可以通过聚合过程，在表和分区级别上聚合统计信息了。检查表和分区上的统计信息。

SQL> select     table_name, global_stats, last_analyzed, num_rows
  2  from dba_tables
  3  where table_name='TEST_TAB1'
  4  and owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     GLO LAST_ANALYZED          NUM_ROWS
------------------------------ --- -------------------- ----------
TEST_TAB1                      NO  28-FEB-2010 08:06:26          5

SQL> 
SQL> select     table_name, partition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_partitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     PARTITION_NAME                 GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
TEST_TAB1                      P_20100131                     NO  28-FEB-2010 08:06:26          0
TEST_TAB1                      P_20100201                     NO  28-FEB-2010 08:06:26          0
TEST_TAB1                      P_20100202                     NO  28-FEB-2010 08:06:26          0
TEST_TAB1                      P_20100203                     NO  28-FEB-2010 08:06:26          0
TEST_TAB1                      P_20100204                     NO  28-FEB-2010 08:06:26          0
TEST_TAB1                      P_20100205                     NO  28-FEB-2010 08:06:26          0
TEST_TAB1                      P_20100206                     NO  28-FEB-2010 08:06:26          5
TEST_TAB1                      P_20100207                     NO  28-FEB-2010 08:06:26          0

8 rows selected

所有的分区统计信息已经被更新，并且是聚合的，而不是真正的全局统计信息。将统计信息处理的粒度由PARTITION改为SUBPARTITION的这一元数据的修改，确保统计信息的收集总是在子分区级别，并防止问题的再次发生。
你可能会争辩说，我们本可以通过使用默认的统计数据收集策略来避免所有这一切，而不是试图变得太聪明，但我们确实很难支持所需的额外工作负载。哦，这个例子让问题变得很明显，因为统计数据是在空分区上收集的，我们知道我们已经这样做了，并且相对容易地观察到0行的分区，但是想象一下，如果有人出于某种原因手动收集了分区的统计数据（在看似合理的情况下做出这个决定并不难），分区的行数有几百万左右，直到有人决定重复这个过程，这种状态将被冻结并永远保持吗？您是否真的注意到聚合过程，由于某种原因而不起作用了？

不管问题是否是我们自己造成的，只要我们发现了这个错误，我可以想象其他人如果不完全理解聚合过程，也会犯同样的错误。

在接下来的几个贴子中，我会介绍一些我们已经研究过的，Oracle引入的新方法。看一看他们是否可以帮助我们更好的收集全局统计信息，并减少收集统计信息的开销。（点此浏览《Oracle分区表上的统计信息（五）》）

原文链接：http://oracledoug.com/serendipity/index.php?/archives/1568-Statistics-on-Partitioned-Tables-Part-4.html
原文内容：
In the last post I illustrated the problems you can run into when you rely on Oracle to aggregate statistics on partitions or subpartitions to generate estimated Global Statistics at higher levels of the table. Until there are statistics for all of the relevant structures then aggregation won’t take place so, for example, if you have statistics for three out of four subpartitions, there won’t be any aggregated global statistics on the related partition until you gather statistics on the fourth subpartition. Randolf Geist left a comment describing how you might avoid problems with this.

“In order to solve the issue of adding partitions with initially missing statistics screwing up the aggregated statistics it was taken care that newly added subpartitions got their statistics immediately updated (with 0 rows in that case) - which didn’t take a lot of time since the subpartitions were empty and it solved the issue with the aggregated statistics.”

That’s what our system does, but we introduced a change in the last release that caused the problems that inspired this series of posts …

First let’s start with an empty table (definition hasn’t changed since the first post). Now, because we are so paranoid about partitions without stats, we’ll gather statistics at the PARTITION level even though the table is empty at the moment. I’m not going to specify a partition name here to cut the text back a bit, but on the real system we would have. Regardless, we’ll still see the same problematic end result.

SQL> exec dbms_stats.gather_table_stats(‘TESTUSER’, ‘TEST_TAB1’, GRANULARITY => ‘PARTITION’);

PL/SQL procedure successfully completed.

SQL> select table_name, global_stats, last_analyzed, num_rows
2 from dba_tables
3 where table_name=‘TEST_TAB1’
4 and owner=‘TESTUSER’
5 order by 1, 2, 4 desc nulls last;

TABLE_NAME GLO LAST_ANALYZED NUM_ROWS

TEST_TAB1 NO 28-FEB-2010 08:04:24 0

OK, so the table statistics aren’t true Global Statistics but that’s ok, we know about that. We also know that there’s no data in the table at this stage so the stats reflect that. When we look at the Partition level stats :-

SQL> select table_name, partition_name, global_stats, last_analyzed, num_rows
2 from dba_tab_partitions
3 where table_name=‘TEST_TAB1’
4 and table_owner=‘TESTUSER’
5 order by 1, 2, 4 desc nulls last;

TABLE_NAME PARTITION_NAME GLO LAST_ANALYZED NUM_ROWS

TEST_TAB1 P_20100131 YES 28-FEB-2010 08:04:24 0
TEST_TAB1 P_20100201 YES 28-FEB-2010 08:04:24 0
TEST_TAB1 P_20100202 YES 28-FEB-2010 08:04:24 0
TEST_TAB1 P_20100203 YES 28-FEB-2010 08:04:24 0
TEST_TAB1 P_20100204 YES 28-FEB-2010 08:04:24 0
TEST_TAB1 P_20100205 YES 28-FEB-2010 08:04:24 0
TEST_TAB1 P_20100206 YES 28-FEB-2010 08:04:24 0
TEST_TAB1 P_20100207 YES 28-FEB-2010 08:04:24 0

8 rows selected.

They are true global statistics, albeit on no data at this stage, but at least we have some statistics to reflect that. Looking at the Subpartition stats :-

SQL> select table_name, subpartition_name, global_stats, last_analyzed, num_rows
2 from dba_tab_subpartitions
3 where table_name=‘TEST_TAB1’
4 and table_owner=‘TESTUSER’
5 order by 1, 2, 4 desc nulls last;

TABLE_NAME SUBPARTITION_NAME GLO LAST_ANALYZED NUM_ROWS

TEST_TAB1 P_20100131_GROT NO
TEST_TAB1 P_20100131_HALO NO
TEST_TAB1 P_20100131_JUNE NO
TEST_TAB1 P_20100131_OTHERS NO
TEST_TAB1 P_20100201_GROT NO
TEST_TAB1 P_20100201_HALO NO
TEST_TAB1 P_20100201_JUNE NO
TEST_TAB1 P_20100201_OTHERS NO

TEST_TAB1 P_20100206_GROT NO
TEST_TAB1 P_20100206_HALO NO
TEST_TAB1 P_20100206_JUNE NO
TEST_TAB1 P_20100206_OTHERS NO
TEST_TAB1 P_20100207_GROT NO
TEST_TAB1 P_20100207_HALO NO
TEST_TAB1 P_20100207_JUNE NO
TEST_TAB1 P_20100207_OTHERS NO

32 rows selected.

No subpartition stats at all at this stage which is expected behaviour and we’ll be gathering them later after we load the data. I’m going to skip the column statistics at this stage because I don’t need them to illustrate the problem. So let’s imagine that on the live system we’ve just created the partitions above and are about to load data into the P_20100206_GROT subpartition.

SQL> INSERT INTO TEST_TAB1 VALUES (20100206, ‘GROT’, 100000, ‘P’);

1 row created.

SQL> INSERT INTO TEST_TAB1 VALUES (20100206, ‘GROT’, 3000000, ‘P’);

1 row created.

SQL> INSERT INTO TEST_TAB1 VALUES (20100206, ‘GROT’, 200000, ‘P’);

1 row created.

SQL> INSERT INTO TEST_TAB1 VALUES (20100206, ‘GROT’, 110000, ‘P’);

1 row created.

SQL> INSERT INTO TEST_TAB1 VALUES (20100206, ‘GROT’, 240000, ‘U’);

1 row created.

SQL> COMMIT;

Commit complete.

Next our normal stats gathering approach is invoked and we gather stats on the subpartition just loaded.

SQL> exec dbms_stats.gather_table_stats(‘TESTUSER’, ‘TEST_TAB1’, GRANULARITY => ‘SUBPARTITION’,
PARTNAME => ‘P_20100206_GROT’);

PL/SQL procedure successfully completed.

N.B. It’s probably worth pointing out at this stage that I put a short pause in the test script between the original stats gathering on the empty table and the INSERTs and gather on the newly-loaded subpartition so you might want to pay attention to the LAST_ANALYZED values here.

So how do the stats look?

SQL> select table_name, global_stats, last_analyzed, num_rows
2 from dba_tables
3 where table_name=‘TEST_TAB1’
4 and owner=‘TESTUSER’
5 order by 1, 2, 4 desc nulls last;

TABLE_NAME GLO LAST_ANALYZED NUM_ROWS

TEST_TAB1 NO 28-FEB-2010 08:06:25 0

Mmmmmm … I can see that the LAST_ANALYZED time has been updated, but NUM_ROWS is still 0 at the table level. How about the partitions?

TABLE_NAME PARTITION_NAME GLO LAST_ANALYZED NUM_ROWS

8 rows selected.

So, as far as the Table and Partition Statistics look, this table is still empty! That’s not good and I can imagine a near future of execution plans with CARDINALITY=1 and MERGE JOIN CARTESIAN. Looking at the LAST_ANALYSED values on the Partitions, I can see that the timestamp hasn’t changed, which is another sign that something is wrong.

I’ll check that the subpartition stats were gathered correctly.

TABLE_NAME SUBPARTITION_NAME GLO LAST_ANALYZED NUM_ROWS

TEST_TAB1 P_20100206_GROT YES 28-FEB-2010 08:06:25 5
TEST_TAB1 P_20100206_HALO NO
TEST_TAB1 P_20100206_JUNE NO
TEST_TAB1 P_20100206_OTHERS NO
TEST_TAB1 P_20100207_GROT NO
TEST_TAB1 P_20100207_HALO NO
TEST_TAB1 P_20100207_JUNE NO
TEST_TAB1 P_20100207_OTHERS NO

32 rows selected.

Ah, perhaps that’s what the problem is. Only one of the P_20100206 subpartitions has valid stats so Oracle can not generate aggregated Global Stats at the higher levels of the table. So I’ll try to fix that by gathering statistic on all of the subpartitions in the table. (In fact, I only really need to gather stats on the remaining P_20100206 subpartitions but I’ll use this approach for brevity)

SQL> exec dbms_stats.gather_table_stats(‘TESTUSER’, ‘TEST_TAB1’, GRANULARITY => ‘SUBPARTITION’);

PL/SQL procedure successfully completed.

Let’s check that all of the subpartitions have valid statistics now.

TABLE_NAME SUBPARTITION_NAME GLO LAST_ANALYZED NUM_ROWS

TEST_TAB1 P_20100131_GROT YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100131_HALO YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100131_JUNE YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100131_OTHERS YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100201_GROT YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100201_HALO YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100201_JUNE YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100201_OTHERS YES 28-FEB-2010 08:06:25 0

TEST_TAB1 P_20100206_GROT YES 28-FEB-2010 08:06:25 5
TEST_TAB1 P_20100206_HALO YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100206_JUNE YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100206_OTHERS YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100207_GROT YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100207_HALO YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100207_JUNE YES 28-FEB-2010 08:06:25 0
TEST_TAB1 P_20100207_OTHERS YES 28-FEB-2010 08:06:25 0

32 rows selected.

OK, so Oracle should have aggregated the subpartition stats to generate global stats on the partitions and table.

SQL> select table_name, global_stats, last_analyzed, num_rows
2 from dba_tables
3 where table_name=‘TEST_TAB1’
4 and owner=‘TESTUSER’
5 order by 1, 2, 4 desc nulls last;

TABLE_NAME GLO LAST_ANALYZED NUM_ROWS

TEST_TAB1 NO 28-FEB-2010 08:06:25 0

SQL>
SQL> select table_name, partition_name, global_stats, last_analyzed, num_rows
2 from dba_tab_partitions
3 where table_name=‘TEST_TAB1’
4 and table_owner=‘TESTUSER’
5 order by 1, 2, 4 desc nulls last;

TABLE_NAME PARTITION_NAME GLO LAST_ANALYZED NUM_ROWS

8 rows selected.

So, according to the table and the partition stats, the table is still empty and those partition statistics still haven’t been updated!

The problem here is that Oracle won’t overwrite true global stats with aggregated global stats. When you think about it, that’s a sensible approach because if I have a strategy of collecting Table and Partition stats (i.e. the Oracle-recommended strategy covered in the first post) then the last thing I want is those global stats constantly being overwritten by aggregated stats (with incorrect NDVs) when stats are gathered on subpartitions!

Our mistake here could be viewed as a combination of a) not following Oracle recommendations (because if we did, we’d also be gathering global stats on the Table and Partitions using a seperate task and b) once we depart from that strategy, gathering stats at the incorrect level. Those Partition stats that we gathered can never be over-written except by gathering stats again on the Partitions, which would then be aggregated up to the table level.

Allowing for the fact we want to (have to?), use our current approach, we should only ever gather stats at the SUBPARTITION level which will then be aggregated up to the Table and the Partition level.

As for the fix, we deleted the existing stats, to rid the partitions of their global stats and then regathered at the SUBPARTITION level as a one-off exercise.

SQL> exec dbms_stats.delete_table_stats(‘TESTUSER’, ‘TEST_TAB1’)

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(‘TESTUSER’, ‘TEST_TAB1’, GRANULARITY => ‘SUBPARTITION’);

PL/SQL procedure successfully completed.

The important change is that we now have aggregated stats at both the Table and Partition levels which can then be updated by the aggregation process as we gather stats on new SUBPARTITIONS. Checking the statistics on the Table and Partitions …

SQL> select table_name, global_stats, last_analyzed, num_rows
2 from dba_tables
3 where table_name=‘TEST_TAB1’
4 and owner=‘TESTUSER’
5 order by 1, 2, 4 desc nulls last;

TABLE_NAME GLO LAST_ANALYZED NUM_ROWS

TEST_TAB1 NO 28-FEB-2010 08:06:26 5

TABLE_NAME PARTITION_NAME GLO LAST_ANALYZED NUM_ROWS

TEST_TAB1 P_20100131 NO 28-FEB-2010 08:06:26 0
TEST_TAB1 P_20100201 NO 28-FEB-2010 08:06:26 0
TEST_TAB1 P_20100202 NO 28-FEB-2010 08:06:26 0
TEST_TAB1 P_20100203 NO 28-FEB-2010 08:06:26 0
TEST_TAB1 P_20100204 NO 28-FEB-2010 08:06:26 0
TEST_TAB1 P_20100205 NO 28-FEB-2010 08:06:26 0
TEST_TAB1 P_20100206 NO 28-FEB-2010 08:06:26 5
TEST_TAB1 P_20100207 NO 28-FEB-2010 08:06:26 0

8 rows selected

All of the partition stats have been updated and are now aggregated rather than true global stats. A modification to the metadata that our stats process uses to change the granularity from PARTITION to SUBPARTITION will ensure stats are always gathered at the subpartition level and stop the problem from re-occuring.

You could argue that we could have avoided all of this by just using the default stats gathering strategy and not try to be too clever, but we would really struggle to support the required additional workload. Oh, and this example makes the problem obvious because the stats were gathered on empty partitions, we knew we’d done so and it was relatively easy to spot zero-row partitions, but imagine if someone gathered statistics on your partitions manually for some reason (it wouldn’t be difficult to decide that seemed sensible) and the row counts for partitions are several million or so, frozen and stuck that way forever until someone decides to repeat the process? Would you really notice the aggregation process wasn’t working for some reason?

Regardless of whether the problem is self-inflicted, as soon as we spotted this mistake, I could imagine others making the same mistake if they don’t understand the aggregation process fully.

In the next few posts I’ll look at some of the new approaches Oracle has introduced which we’ve investigated, to see if they can help us to gather better global statistics and/or reduce our stats-gathering workload.

partition optimizer statistics

最后修改时间：2021-12-18 16:55:27

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

Oracle分区表上的统计信息（四）

评论