暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 13 preview - NOT IN -> anti join 性能优化

digoal 2020-01-01
368

作者

digoal

日期

2020-01-01

标签

PostgreSQL , NOT IN , ANTI JOIN


背景

not in 转换为 anti join

https://commitfest.postgresql.org/26/2023/

https://www.postgresql.org/message-id/flat/1550706289606-0.post@n3.nabble.com

```
Hey, here is our latest patch. Major changes in this patch include:
1. Use the original hashed subplan if the inner fits in memory as decided by subplan_is_hashable().
2. Fixed the inner relation empty case by adding an inner relation existence check when we pull x out as a filter on the outer (see details below).
3. Integrate David Rowley's routine to use strict predicates and inner join conditions when checking nullability of a Var.

Detailed description of the patch:

NOT IN to ANTI JOIN transformation

If the NOT IN subquery is not eligible for hashed subplan as decided by  
subplan_is_hashable(), do the following NOT IN to ANTI JOIN  
transformation:

Single expression:  
    When x is nullable:  
    t1.x not in (t2.y where p) =>  
    ANTI JOIN  
    t1 (Filter: t1.x IS NOT NULL or NOT EXISTS (select 1 from t2 where p)),  
    t2 on join condition (t1.x=t2.y or t2.y IS NULL)  
    and p.  
    The predicate "t2.y IS NULL" can be removed if y is non-nullable.

    When x is non-nullable:  
    t1.x not in (t2.y where p) =>  
    ANTI JOIN t1, t2 on join condition (t1.x=t2.y or t2.y IS NULL)  
    and p.  
    The predicate "t2.y IS NULL" can be removed if y is non-nullable.

Multi expression:  
    If all xi's are nullable:  
    (x1, x2, ... xn) not in (y1, y2, ... yn ) =>  
    ANTI JOIN t1, t2 on join condition:  
    ((t1.x1 = t2.y1) and ... (t1.xi = t2.yi) ... and  
     (t1.xn = t2.yn)) IS NOT FALSE.

    If at least one xi is non-nuallable:  
    (x1, x2, ... xn) not in (y1, y2, ... yn ) =>  
    ANTI JOIN t1, t2 on join condition:  
    (t1.x1 = t2.y1 or t2.y1 IS NULL or t1.x1 IS NULL) and ...  
    (t1.xi = t2.yi or t2.yi IS NULL) ... and  
    (t1.xn = t2.yn or t2.yn IS NULL or t1.xn IS NULL).

Add nullability testing routine is_node_nonnullable(), currently it  
handles Var, TargetEntry, CoalesceExpr and Const. It uses strict  
predicates, inner join conditions and NOT NULL constraint to check  
the nullability of a Var.

Adjust and apply reduce_outer_joins() before the transformation so  
that the outer joins have an opportunity to be converted to inner joins  
prior to the transformation.

We measured performance improvements of two to five orders of magnitude  
on most queries in a development environment. In our performance experiments,  
table s (small) has 11 rows, table l (large) has 1 million rows. s.n and l.n  
have NULL value. s.nn and l.nn are NOT NULL. Index is created on each column.

Cases using hash anti join:  
l.nn not in (l.nn) 21900s -> 235ms  
l.nn not in (l.nn where u>0) 22000s -> 240ms  
l.n not in (l.nn) 21900s -> 238ms  
l.n not in (l.nn where u>0) 22000s -> 248ms

Cases using index nested loop anti join  
s.n not in (l.nn) 360ms -> 0.5ms  
s.n not in (l.nn where u>0) 390ms -> 0.6ms  
s.nn not in (l.nn) 365ms -> 0.5ms  
s.nn not in (l.nn where u>0) 390ms -> 0.5ms

Cases using bitmap heap scan on the inner and nested loop anti join:  
s.n not in (l.n) 360ms -> 0.7ms  
l.n not in (l.n) 21900s ->  1.6s  
l.n not in (l.n where u>0) 22000s -> 1680ms  
s.nn not in (l.n) 360ms -> 0.5ms  
l.nn not in (l.n) 21900s -> 1650ms  
l.nn not in (l.n where u>0) 22000s -> 1660ms

Cases using the original hashed subplan:  
l.n not in (s.n) 63ms -> 63ms  
l.nn not in (s.nn) 63ms -> 63ms  
l.n not in (s.n where u>0) 63ms -> 63ms

Comments are welcome.

Regards,

Zheng Li
AWS, Amazon Aurora PostgreSQL
```

参考

https://www.postgresql.org/message-id/flat/20190621185207.GA27929@alvherre.pgsql

https://commitfest.postgresql.org/26/2164/

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论