暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 18 preview - tuple deformation 优化, OLAP 场景性能提升5-20%

铃木园子 2025-01-16
337

PostgreSQL 18 preview - tuple deformation 优化, OLAP 场景性能提升5-20%

作者

digoal

日期

2024-12-30

标签

PostgreSQL , PolarDB , DuckDB , tuple deformation , OLAP


背景

OLAP场景的特征之一是一个SQL请求可能要处理大量tuple其中的某些列, 对于行式存储来说tuple deformation的开销可能会比较大.

PostgreSQL 18 对tuple deformation进行优化, OLAP 场景性能提升5-20%

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=58a359e585d0281ecab4d34cab9869e7eb4e4ca3

Speedup tuple deformation with additional function inlining  
author  David Rowley <drowley@postgresql.org>   
Fri, 27 Dec 2024 23:20:42 +0000 (12:20 +1300)  
committer David Rowley <drowley@postgresql.org>   
Fri, 27 Dec 2024 23:20:42 +0000 (12:20 +1300)  
commit  58a359e585d0281ecab4d34cab9869e7eb4e4ca3  
tree  41a9d2725f0a64223cdc5636b6f1f9f573af6407  tree  
parent  d85ce012f99f63249bb45a78fcecb7a2383920b1  commit | diff  
Speedup tuple deformation with additional function inlining  
  
This adjusts slot_deform_heap_tuple() to add special-case loops to  
eliminate much of the branching that was done within the body of the  
main deform loop.  
  
Previously, while looping over each attribute to deform,  
slot_deform_heap_tuple() would always recheck if the given attribute was  
NULL by looking at HeapTupleHasNulls() and if so, went on to check the  
tuple's NULL bitmap.  Since many tuples won't contain any NULLs, we can  
just check HeapTupleHasNulls() once and when there are no NULLs, use a  
more compact version of the deforming loop which contains no NULL checking  
code at all.  
  
The same is possible for the "slow" mode checking part of the loop.  That  
variable was checked several times for each attribute, once to determine  
if the offset to the attribute value could be taken from the attcacheoff,  
and again to check if the offset could be cached for next time.  
  
These "slow" checks can mostly be eliminated by instead having multiple  
loops.  Initially, we can start in the non-slow loop and break out of  
that loop if and only if we must stop caching the offset.  This  
eliminates branching for both slow and non-slow deforming methods.  The  
amount of code required for the no nulls / non-slow version is very  
small.  It's possible to have separate loops like this due to the fact  
that once we move into slow mode, we never need to switch back into  
non-slow mode for a given tuple.  
  
We have the compiler take care of writing out the multiple required  
loops by having a pg_attribute_always_inline function which gets called  
various times passing in constant values for the "slow" and "hasnulls"  
parameters.  This allows the compiler to eliminate const-false branches  
and remove comparisons for const-true ones.  
  
This commit has shown overall query performance increases of around 5-20%  
in deform-heavy OLAP-type workloads.  
  
Author: David Rowley  
Reviewed-by: Victor Yegorov  
Discussion: https://postgr.es/m/CAGnEbog92Og2CpC2S8=g_HozGsWtt_3kRS1sXjLz0jKSoCNfLw@mail.gmail.com  
Discussion: https://postgr.es/m/CAApHDvo9e0XG71WrefYaRv5n4xNPLK4k8LjD0mSR3c9KR2vi2Q@mail.gmail.com  

期望 PostgreSQL|开源PolarDB 增加什么功能?

PolarDB 开源数据库

PolarDB 学习图谱

PostgreSQL 解决方案集合

德哥 / digoal’s Github - 公益是一辈子的事.

About 德哥

digoal's wechat

文章转载自铃木园子,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论