Symptom
Last week a synchronization job in production failed and the sync was interrupted. The log showed the following error:
ERROR,XX000,"right sibling's left-link doesn't match: block 53485 links to 31332 instead of expected 31324 in index ""xxx""",
Fix
The fix is fairly simple: rebuild the affected index.
create index concurrently on xx(xx);
drop index concurrently corr_idex;
Until the index is rebuilt, any operation that uses it raises the error above, whether it is an insert or an update.
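Cause
• The underlying storage is corrupted
• A program unrelated to PostgreSQL overwrote part of the file contents
For storage corruption, consider backing up the data and then either moving to a new cluster and re-importing, or, if a primary/standby pair exists, switching over to the standby. Once the storage is confirmed to be at fault, repairing the index only postpones the next failure. Later on, table rows may even be written to bad blocks, and at that point the records are lost.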
The relevant code is in src/backend/access/nbtree/nbtpage.c:
/*
 * Check that the parent-page index items we're about to delete/overwrite
 * in subtree parent page contain what we expect. This can fail if the
 * index has become corrupt for some reason. We want to throw any error
 * before entering the critical section --- otherwise it'd be a PANIC.
 */
page = BufferGetPage(subtreeparent);
opaque = (BTPageOpaque) PageGetSpecialPointer(page);

#ifdef USE_ASSERT_CHECKING

/*
 * This is just an assertion because _bt_lock_subtree_parent should have
 * guaranteed tuple has the expected contents
 */
itemid = PageGetItemId(page, poffset);
itup = (IndexTuple) PageGetItem(page, itemid);
Assert(BTreeTupleGetDownLink(itup) == topparent);
#endif

nextoffset = OffsetNumberNext(poffset);
itemid = PageGetItemId(page, nextoffset);
itup = (IndexTuple) PageGetItem(page, itemid);
if (BTreeTupleGetDownLink(itup) != topparentrightsib)
    ereport(ERROR,
            (errcode(ERRCODE_INDEX_CORRUPTED),
             errmsg_internal("right sibling %u of block %u is not next child %u of block %u in index \"%s\"",
                             topparentrightsib, topparent,
                             BTreeTupleGetDownLink(itup),
                             BufferGetBlockNumber(subtreeparent),
                             RelationGetRelationName(rel))));

/*
 * Any insert which would have gone on the leaf block will now go to its
 * right sibling. In other words, the key space moves right.
 */
PredicateLockPageCombine(rel, leafblkno, leafrightsib);

/* No ereport(ERROR) until changes are logged */
START_CRIT_SECTION();
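The message in the log ("block 53485 links to 31332 instead of expected 31324") describes a broken sibling link: each B-tree page records the block numbers of its left and right neighbours, and a page's right sibling is expected to point back at it. The following is a minimal, self-contained sketch of that kind of check. It is not PostgreSQL code; the names ToyPage, find_page and check_right_sibling are hypothetical and exist only for illustration, and 0 is assumed to mean "no sibling".

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified model of a B-tree page: only the sibling links. */
typedef struct {
    unsigned int blkno; /* this page's block number */
    unsigned int prev;  /* left-link: block number of the left sibling (0 = none) */
    unsigned int next;  /* right-link: block number of the right sibling (0 = none) */
} ToyPage;

/* Look up a page by block number in a small array; NULL if it is absent. */
static const ToyPage *find_page(const ToyPage *pages, size_t n, unsigned int blkno)
{
    for (size_t i = 0; i < n; i++)
        if (pages[i].blkno == blkno)
            return &pages[i];
    return NULL;
}

/*
 * Verify that the right sibling of `target` links back to `target`.
 * Returns 0 when the links agree, -1 otherwise; on failure a description
 * of the problem is written into msg.
 */
static int check_right_sibling(const ToyPage *pages, size_t n,
                               unsigned int target, char *msg, size_t msglen)
{
    const ToyPage *page = find_page(pages, n, target);
    if (page == NULL) {
        snprintf(msg, msglen, "block %u not found", target);
        return -1;
    }

    const ToyPage *right = find_page(pages, n, page->next);
    if (right == NULL) {
        snprintf(msg, msglen, "right sibling %u of block %u not found",
                 page->next, target);
        return -1;
    }

    if (right->prev != target) {
        snprintf(msg, msglen,
                 "right sibling's left-link doesn't match: "
                 "block %u links to %u instead of expected %u",
                 right->blkno, right->prev, target);
        return -1;
    }
    return 0;
}
```

With three toy pages {31324, 0, 53485}, {53485, 31332, 0} and {31332, 0, 53485}, checking block 31324 fails because block 53485 points back at 31332, which reproduces the shape of the logged message. Checking block 31332 succeeds.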
This article is reposted from DB那些事儿.




