暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

并行创建索引探析 - Create Index Concurrently

3196
      开源PostgreSQL爱好者,长期从事PostgreSQL数据库运维工作。
      默认创建索引过程中由于需要请求ShareLock级别的锁,将会阻塞对表的修改操作,为了降低索引创建的影响,在PG中支持并发索引创建,该操作请求的锁级别为ShareUpdateExclusiveLock级别的表,允许对表的插入/更新/删除操作。
      但是,并发创建索引也有自身需要注意的事项,由于不阻塞对表的修改操作,因此需要分多个步骤完成,这一点在手册上有详细说明,如下
In a concurrent index build, the index is actually entered into the system catalogs in one transaction, then two table scans occur in two more transactions. Before each table scan, the index build must wait for existing transactions that have modified the table to terminate. After the second scan, the index build must wait for any transactions that have a snapshot predating the second scan to terminate. Then finally the index can be marked ready for use, and the CREATE INDEX command terminates. Even then, however, the index may not be immediately usable for queries: in the worst case, it cannot be used as long as transactions exist that predate the start of the index build.
       并发创建索引需要分多个步骤完成,首先在一个事务中将相关索引信息记录到系统表中,但是将索引信息标记为非法状态,然后需要进行两次扫描,并且在最后需要等待第二次扫描之前产生的所有具有快照信息的事务结束,最后修改索引的状态信息为可用。
      以上是手册上给出的并发索引创建的描述,理解起来不太容易,结合下面的例子业务会更加容易理解其过程。
    create table test1(a int);
    create table test2(a int);
    Sessio1:
    Begin;
    Lock table test1;
    Session2:
    Begin;
    Insert into test1 values(10);
    Session3:
    Create index concurrently b_idx on test2(a);
          在上面的例子中,session3中并发索引创建将被阻塞,导致无法创建成功,阻塞的原因是前面描述中标注导致的。索引创建进程信息如下
      postgres=# select * from pg_stat_activity where pid = 3561;
      -[ RECORD 1 ]----+---------------------------------------------
      datid | 13665
      datname | postgres
      pid | 3561
      usesysid | 10
      usename | postgres
      application_name | psql
      wait_event_type | Lock
      wait_event | virtualxid
      state | active
      query | create index concurrently a_idx on test2(a);
             从上面可以看到索引创建在等待虚拟事务ID的释放,进一步通过pg_locks视图可以找到持有该锁的进程。
        postgres=#select relation,virtualxid,virtualtransaction,granted,pid from pg_locks where pid = 3561;
        relation | virtualxid | virtualtransaction | granted | pid
        ----------+------------+--------------------+---------+------
        | 5/8 | 5/8 | t | 3561
        16387 | | 5/8 | t | 3561
        | 4/2 | 5/8 | f | 3561
        (3 rows)
        postgres=# select relation,virtualxid,virtualtransaction,granted,pid from pg_locks where virtualxid='4/2' and granted = 't';
        relation | virtualxid | virtualtransaction | granted | pid
        ----------+------------+--------------------+---------+------
        | 4/2 | 4/2 | t | 3456
        (1 row)
              从上面结果可以看出,索引创建在等待进程3456的结束,对应操作为
          postgres=# select * from pg_stat_activity where pid = 3456;
          -[ RECORD 1 ]----+------------------------------
          datid | 13665
          datname | postgres
          pid | 3456
          usesysid | 10
          usename | postgres
          wait_event_type | Lock
          wait_event | relation
          state | active
          backend_xmin | 516
          query | insert into test1 values(10);
                 对应前面的session2操作,如果需要了解这其中的具体原因,需要仔细阅读索引创建函数DefineIndex,了解其中操作过程,下面给出整理的大致流程和等待进程的调用栈信息。
                 结合堆栈和流程图,可以看到,阻塞发生在第三阶段上,等待其他事务完成,是不是所有的事务都会阻塞呢?答案是否定的,索引创建的过程不是和所有事务都会阻塞,可以参考GetCurrentVirtualXIDs函数实现。
                 如果强制结束前面的索引创建进程,根据前面的流程图,该索引实际上已经记录在系统表中,只是状态标记为invalid,如下
                 如果发生这种情况,索引是无法使用到的,但是仍然会插入相关数据,影响更改的性能,需要重建该索引或者删除该索引后重新创建,手册说明如下: 
          If a problem arises while scanning the table, such as a deadlock or a uniqueness violation in a unique index, the CREATE INDEX command will fail but leave behind an “invalid” index. This index will be ignored for querying purposes because it might be incomplete; however it will still consume update overhead.
          The recommended recovery method in such cases is to drop the index and try again to perform CREATE INDEX CONCURRENTLY. (Another possibility is to rebuild the index with REINDEX INDEX CONCURRENTLY).
           

          扫描钉钉直播群二维码 关注我们免费看直播

          文章转载自PostgreSQL中文社区,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

          评论