idx_tup_fetch和idx_tup_read，傻傻分不清楚

原创 xiongcc 2024-05-17

672

前言

我们在分析索引效率的时候，时常会借助 pg_stat_all_indexes 这个视图，其中的 idx_scan 字段很好理解，索引扫描的次数，另外在 16 版本中还增加了 last_idx_scan 字段，可以让我们获取最近未使用过的索引列表。在以前的版本是无法直接查找在给定时间段内是否使用过某个索引的。在此视图中，还有两个关键字段：idx_tup_read 和 idx_tup_fetch，那么这两个字段有有何区别？

分析

官网上对于二者有简单介绍

idx_tup_fetch：Number of live table rows fetched by simple index scans using this index，通过索引扫描返回的活元组数量
idx_tup_read：Number of index entries returned by scans on this index，扫描索引返回的索引条目数量，每当读取索引条目时，其值就会增加。

The idx_tup_read and idx_tup_fetch counts can be different even without any use of bitmap scans, because idx_tup_read counts index entries retrieved from the index while idx_tup_fetch counts live rows fetched from the table. The latter will be less if any dead or not-yet-committed rows are fetched using the index, or if any heap fetches are avoided by means of an index-only scan.
即使不使用任何位图扫描，idx_tup_read 和 idx_tup_fetch 计数也可能不同，因为 idx_tup_read 计数从索引检索的索引条目，而 idx_tup_fetch 计数从表获取的活元组。如果使用索引获取任何死行或尚未提交的行，或者通过仅索引扫描避免任何堆获取，则后者会更少。

通过阅读官网，现在可以得出一个粗糙的结论：

死元组会影响到这两个指标的统计
index only scan 也会影响到二者的统计

让我们试一下

postgres=# create table t1(id int,info text);
CREATE TABLE
postgres=# create index on t1(id);
CREATE INDEX
postgres=# insert into t1 select n,md5(random()::text) from generate_series(1,1000) as n;
INSERT 0 1000
postgres=# analyze t1;
ANALYZE

postgres=# select * from pg_stat_all_indexes where relname = 't1';
-[ RECORD 1 ]-+----------
relid         | 167949
indexrelid    | 167954
schemaname    | public
relname       | t1
indexrelname  | t1_id_idx
idx_scan      | 0
last_idx_scan | 
idx_tup_read  | 0
idx_tup_fetch | 0

现在索引的各项指标皆为 0。先跑一下正常的 index scan

postgres=# explain analyze select * from t1 where id = 99;
                                                  QUERY PLAN                                                   
---------------------------------------------------------------------------------------------------------------
 Index Scan using t1_id_idx on t1  (cost=0.28..8.29 rows=1 width=37) (actual time=0.026..0.027 rows=1 loops=1)
   Index Cond: (id = 99)
 Planning Time: 0.116 ms
 Execution Time: 0.050 ms
(4 rows)

postgres=# select * from pg_stat_all_indexes where relname = 't1';
-[ RECORD 1 ]-+------------------------------
relid         | 167949
indexrelid    | 167954
schemaname    | public
relname       | t1
indexrelname  | t1_id_idx
idx_scan      | 1
last_idx_scan | 2024-05-16 16:28:34.911803+08
idx_tup_read  | 1
idx_tup_fetch | 1

通过视图，我们可以知道：

索引扫描了一次，idx_scan = 1
读取了一行索引条目，idx_tup_read = 1
通过索引条目回表获取了一行数据，idx_tup_fetch = 1

现在让我们尝试使用 Index Only Scan：

postgres=# explain analyze select id from t1 where id = 99;
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Index Only Scan using t1_id_idx on t1  (cost=0.28..8.29 rows=1 width=4) (actual time=0.033..0.035 rows=1 loops=1)
   Index Cond: (id = 99)
   Heap Fetches: 1
 Planning Time: 0.078 ms
 Execution Time: 0.053 ms
(5 rows)

postgres=# select * from pg_stat_all_indexes where relname = 't1';
-[ RECORD 1 ]-+------------------------------
relid         | 167949
indexrelid    | 167954
schemaname    | public
relname       | t1
indexrelname  | t1_id_idx
idx_scan      | 2
last_idx_scan | 2024-05-16 16:55:31.973902+08
idx_tup_read  | 2
idx_tup_fetch | 2

有点意外的是，idx_tup_read 和 idx_tup_fetch 都增加了，原因不难理解，虽然执行计划显式的是 Index Only Scan，但是由于没有 VM 文件的辅助，执行器还是需要回表去判断数据的可见性 (Heap Fetches: 1)，因此，让我们清理一下，生成 VM 文件。

postgres=# vacuum t1;
VACUUM
postgres=# explain analyze select id from t1 where id = 99;
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Index Only Scan using t1_id_idx on t1  (cost=0.28..4.29 rows=1 width=4) (actual time=0.021..0.021 rows=1 loops=1)
   Index Cond: (id = 99)
   Heap Fetches: 0
 Planning Time: 0.133 ms
 Execution Time: 0.040 ms
(5 rows)

postgres=# select * from pg_stat_all_indexes where relname = 't1';
-[ RECORD 1 ]-+------------------------------
relid         | 167949
indexrelid    | 167954
schemaname    | public
relname       | t1
indexrelname  | t1_id_idx
idx_scan      | 3
last_idx_scan | 2024-05-16 17:00:51.092246+08
idx_tup_read  | 3
idx_tup_fetch | 2

不出所料，这一次 idx_tup_fetch 没有增加，说明读取的索引条目就满足查询，没有再次回表。

由于索引不包含可见性信息，假如新插入了一条"满足"查询条件的数据，行为会怎样？新开一个会话，但是并不提交：

postgres=# begin;
BEGIN
postgres=*# insert into t1 values(199999,'postgres');
INSERT 0 1
postgres=*#

然后查询这条尚未提交的数据，执行计划显式，执行器确实去扫描了这一行索引，并回表去判断了可见性

postgres=# explain analyze select id from t1 where id = 199999;
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Index Only Scan using t1_id_idx on t1  (cost=0.28..4.29 rows=1 width=4) (actual time=0.025..0.026 rows=0 loops=1)
   Index Cond: (id = 199999)
   Heap Fetches: 1
 Planning Time: 0.084 ms
 Execution Time: 0.047 ms
(5 rows)

postgres=# select * from pg_stat_all_indexes where relname = 't1';
-[ RECORD 1 ]-+------------------------------
relid         | 167949
indexrelid    | 167954
schemaname    | public
relname       | t1
indexrelname  | t1_id_idx
idx_scan      | 4
last_idx_scan | 2024-05-16 17:07:39.265043+08
idx_tup_read  | 4
idx_tup_fetch | 2

同样的，idx_tup_fetch 也没有增加，因为并没有通过索引返回实际的行数。

postgres=# select id from t1 where id = 199999;
 id 
----
(0 rows)

现在让我们回滚这条事务，并再次查询

postgres=# begin;
BEGIN
postgres=*# insert into t1 values(199999,'postgres');
INSERT 0 1
postgres=*# rollback ;  ---回滚事务
ROLLBACK

postgres=# explain analyze select id from t1 where id = 199999;   ---第一次扫描
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Index Only Scan using t1_id_idx on t1  (cost=0.28..4.29 rows=1 width=4) (actual time=0.029..0.029 rows=0 loops=1)
   Index Cond: (id = 199999)
   Heap Fetches: 1
 Planning Time: 0.076 ms
 Execution Time: 0.067 ms
(5 rows)

postgres=# explain analyze select id from t1 where id = 199999;   ---第二次扫描
                                                    QUERY PLAN                                                     
-------------------------------------------------------------------------------------------------------------------
 Index Only Scan using t1_id_idx on t1  (cost=0.28..4.29 rows=1 width=4) (actual time=0.030..0.030 rows=0 loops=1)
   Index Cond: (id = 199999)
   Heap Fetches: 0
 Planning Time: 0.080 ms
 Execution Time: 0.050 ms
(5 rows)

postgres=# select * from pg_stat_all_indexes where relname = 't1';
-[ RECORD 1 ]-+------------------------------
relid         | 167949
indexrelid    | 167954
schemaname    | public
relname       | t1
indexrelname  | t1_id_idx
idx_scan      | 6
last_idx_scan | 2024-05-16 17:10:46.339602+08
idx_tup_read  | 6
idx_tup_fetch | 2

postgres=# select * from pg_stat_all_indexes where relname = 't1';
-[ RECORD 1 ]-+------------------------------
relid         | 167949
indexrelid    | 167954
schemaname    | public
relname       | t1
indexrelname  | t1_id_idx
idx_scan      | 7
last_idx_scan | 2024-05-16 17:11:03.662315+08
idx_tup_read  | 6
idx_tup_fetch | 2

这一次行为更要有趣一些：

事务回滚，那么 199999 这行数据就变成了死元组，第一次扫描到这条死元组，但是索引不包含可见性信息，因此回表判断，因此 idx_scan + 1 (扫描了一次索引)，idx_tup_read + 1 (扫描了一行索引条目)
第二次扫描的时候，得益于第一次索引扫描将指针的状态修改为了 dead，第二次数据库已经知道这条索引元组失效了，应该忽略 (PostgreSQL 也不会笨到每次都反复去扫描这一条死元组)，所以只是扫描了索引，idx_scan + 1，但是 idx_tup_read 并没有变。

pg_stat_database

在 pg_stat_database 视图中，也有类似指标：tup_fetched 和 tup_returned，这块在源码中有注释

/* ----------
 * PgStat_TableCounts   The actual per-table counts kept by a backend
 *
 * This struct should contain only actual event counters, because we memcmp
 * it against zeroes to detect whether there are any stats updates to apply.
 * It is a component of PgStat_TableStatus (within-backend state).
 *
 * Note: for a table, tuples_returned is the number of tuples successfully
 * fetched by heap_getnext, while tuples_fetched is the number of tuples
 * successfully fetched by heap_fetch under the control of bitmap indexscans.
 * For an index, tuples_returned is the number of index entries returned by
 * the index AM, while tuples_fetched is the number of tuples successfully
 * fetched by heap_fetch under the control of simple indexscans for this index.
 *
 * tuples_inserted/updated/deleted/hot_updated/newpage_updated count attempted
 * actions, regardless of whether the transaction committed.  delta_live_tuples,
 * delta_dead_tuples, and changed_tuples are set depending on commit or abort.
 * Note that delta_live_tuples and delta_dead_tuples can be negative!

tuples_returned 是由 heap_getnext 成功获取的元组数量，而 tuples_fetched 是由位图索引扫描的 heap_fetch 成功获取的元组数量。对于索引，tuples_returned 是索引 AM 返回的索引项数量，而 tuples_fetched 是由控制此索引的简单索引扫描的 heap_fetch 成功获取的元组数量。

对于 OLTP，如果 tuples_returned 显著高于 tuples_fetched，意味着你可能缺失合适的索引。

SELECT
    t.tablename,
    indexname,
    c.reltuples::integer AS num_rows,
    pg_size_pretty(pg_relation_size(quote_ident(t.tablename)::text)) AS table_size,
    pg_size_pretty(pg_relation_size(quote_ident(indexrelname)::text)) AS index_size,
    CASE WHEN x.is_unique = 1  THEN 'Y'
       ELSE 'N'
    END AS UNIQUE,
    idx_scan AS number_of_scans,
    idx_tup_read AS tuples_read,
    idx_tup_fetch AS tuples_fetched
FROM pg_tables t
LEFT OUTER JOIN pg_class c ON t.tablename=c.relname
LEFT OUTER JOIN
       (SELECT indrelid,
           max(CAST(indisunique AS integer)) AS is_unique
       FROM pg_index
       GROUP BY indrelid) x
       ON c.oid = x.indrelid
LEFT OUTER JOIN
    ( SELECT c.relname AS ctablename, ipg.relname AS indexname, x.indnatts AS number_of_columns, idx_scan, idx_tup_read, idx_tup_fetch,indexrelname FROM pg_index x
           JOIN pg_class c ON c.oid = x.indrelid
           JOIN pg_class ipg ON ipg.oid = x.indexrelid
           JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid )
    AS foo
    ON t.tablename = foo.ctablename
WHERE t.schemaname='public'
ORDER BY pg_relation_size(quote_ident(indexrelname)::text) desc;

小结

idx_scan 只是代表扫描了这个索引，也就是执行计划中的 Index Scan 算子
idx_tup_read 表示从索引中读取了多少行条目，不管是死的还是活的
idx_tup_fetch 表示从表中读取了多少行数据，可能是需要回表判断可见性，也可能是投影列不在索引中

因此，如果 idx_tup_fetch 这个值很高，说明要么进行了大量回表，意味着 vacuum 需要更加频繁点，使得 VM 文件保持为较新状态；如果 idx_tup_read 显著高于 idx_tup_fetch，说明要么 index only scan 跑得不错，要么可能索引中存在大量死元组，膨胀了。对于 OLTP 系统，如果 tuples_returned 显著高于 tuples_fetched，意味着你可能缺失合适的索引。

postgresql

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

idx_tup_fetch和idx_tup_read，傻傻分不清楚

前言

分析

pg_stat_database

小结

评论