原文地址:Ordinary in function queries
原文作者:Luca Ferrari
PostgreSQLSELECT语句允许您查询返回结果集(aSET OF或TABLE)的函数,这些函数用作查询本身的元组源。
但是,SELECT当针对提供结果集的函数调用该语句时,会出现一个额外的子句:. 此子句将一列添加到结果集中,其分子(类型为)表示从函数获得的元组的数量。 为什么这很重要?因为你不需要你的函数自己提供一种元组分子。
WITH ORDINALITY在行动
让我们举一个简单的例子来了解它是如何工作的。让我们创建一个返回表的函数:
CREATE OR REPLACE FUNCTION animals( l int DEFAULT 5,
animal text DEFAULT 'cat',
owner text DEFAULT 'nobody' )
RETURNS TABLE( pk int, description text, mood text )
AS $CODE$
DECLARE
i int := 0;
j int := 0;
BEGIN
FOR i IN 1 .. l LOOP
pk := i;
description := format( '%s #%s owned by %s', animal, i, owner );
j := random() * 100;
IF j % 2 = 0 THEN
mood := 'good';
ELSE
mood := 'bad';
END IF;
RAISE DEBUG 'Generating % # % with mood %', animal, i, mood;
RETURN NEXT;
END LOOP;
RETURN;
END
$CODE$
LANGUAGE plpgsql;
上面的函数animals()产生一个输出,其中包含动物的简单名称(数字)、生成的元组的索引(即分子)和随机选择的情绪。
测试它显然很容易:
testdb=> SELECT * FROM animals();
pk | description | mood
----+------------------------+------
1 | cat #1 owned by nobody | bad
2 | cat #2 owned by nobody | good
3 | cat #3 owned by nobody | bad
4 | cat #4 owned by nobody | bad
5 | cat #5 owned by nobody | bad
(5 rows)
该pk列包含生成的元组的分子,因此我们知道该cat #1元组是首先生成的,然后cat #2是第二个,依此类推。
让我们开始吧WITH ORDINALITY:
testdb=> SELECT * FROM animals() WITH ORDINALITY;
pk | description | mood | ordinality
----+------------------------+------+------------
1 | cat #1 owned by nobody | good | 1
2 | cat #2 owned by nobody | bad | 2
3 | cat #3 owned by nobody | good | 3
4 | cat #4 owned by nobody | good | 4
5 | cat #5 owned by nobody | good | 5
该WITH ORDINALITY子句必须遵循它将应用到的功能。这样的子句将一个新列附加到结果集中,默认情况下以ordinality渐进式分子命名。请注意如何包含相同pk的ordinality值:WITH ORDINALITY为您跟踪结果集流(函数)生成的元组,因此您不需要自己计算。
显然,这也适用于元组的重新排序,因为该子句不计算元组的外观,而是立即(或更好地,序列)将元组添加到结果集中:
testdb=> SELECT * FROM animals() WITH ORDINALITY
ORDER BY random();
pk | description | mood | ordinality
----+------------------------+------+------------
4 | cat #4 owned by nobody | good | 4
2 | cat #2 owned by nobody | good | 2
3 | cat #3 owned by nobody | good | 3
5 | cat #5 owned by nobody | good | 5
1 | cat #1 owned by nobody | good | 1
(5 rows)
也可以ordinality使用别名重命名列,如下所示:
AS cat(i, name, mood, n)
ORDER BY random();
i | name | mood | n
---+------------------------+------+---
4 | cat #4 owned by nobody | good | 4
1 | cat #1 owned by nobody | good | 1
2 | cat #2 owned by nobody | bad | 2
5 | cat #5 owned by nobody | bad | 5
3 | cat #3 owned by nobody | good | 3
(5 rows)
显然,您必须为整个结果集设置别名,而不是单个列!
WITH ORDINALITY作为过滤条件
拥有自动命名的ordinality列或自定义选择的命名列,可以将此类列添加到WHERE查询的子句中:
testdb=> SELECT * FROM animals() WITH ORDINALITY AS cat(i, name, mood, n)
WHERE n % 2 = 0
ORDER BY random();
i | name | mood | n
---+------------------------+------+---
4 | cat #4 owned by nobody | bad | 4
2 | cat #2 owned by nobody | bad | 2
(2 rows)
如您所见,上述查询对n列进行过滤以仅获取偶数元组。
WITH ORDINALITY对比row_number()
您可能会认为窗口函数与 的作用相同,至少在函数调用场景中是这样。但是,窗口函数是一个不同的野兽,它可以在针对结果集序数定义的窗口上工作。简而言之,窗口函数涵盖了一系列不同的问题! 因此,即使以下似乎产生了完全相同的结果: [row_number()](https://www.postgresql.org/docs/14/functions-window.html){:target="_blank"}WITH ORDINALITYrow_number()
testdb=> SELECT *, row_number() OVER () FROM animals() WITH ORDINALITY;
pk | description | mood | ordinality | row_number
----+------------------------+------+------------+------------
1 | cat #1 owned by nobody | good | 1 | 1
2 | cat #2 owned by nobody | bad | 2 | 2
3 | cat #3 owned by nobody | bad | 3 | 3
4 | cat #4 owned by nobody | good | 4 | 4
5 | cat #5 owned by nobody | bad | 5 | 5
(5 rows)
一旦您以更专业的方式将分区定义为编号,您就会看到不同的结果:
testdb=> SELECT *, row_number() OVER ( order by pk desc ) FROM animals() WITH ORDINALITY;
pk | description | mood | ordinality | row_number
----+------------------------+------+------------+------------
5 | cat #5 owned by nobody | bad | 5 | 1
4 | cat #4 owned by nobody | bad | 4 | 2
3 | cat #3 owned by nobody | bad | 3 | 3
2 | cat #2 owned by nobody | good | 2 | 4
1 | cat #1 owned by nobody | good | 1 | 5
在上面,您可以看到函数生成的最后一行(thisordinality设置为5)是 遇到的第一行row_number()。
加入时可以快速得到不同结果的另一个例子:
estdb=> SELECT *, row_number() OVER ()
FROM animals() WITH ORDINALITY,
generate_series(1, 3) WITH ORDINALITY as x(gs, counter);
pk | description | mood | ordinality | gs | counter | row_number
----+------------------------+------+------------+----+---------+------------
1 | cat #1 owned by nobody | bad | 1 | 1 | 1 |
2 | cat #2 owned by nobody | good | 2 | 1 | 1 |
3 | cat #3 owned by nobody | good | 3 | 1 | 1 |
4 | cat #4 owned by nobody | good | 4 | 1 | 1 |
5 | cat #5 owned by nobody | good | 5 | 1 | 1 |
6| cat #1 owned by nobody | bad | 1 | 2 | 2 |
7 | cat #2 owned by nobody | good | 2 | 2 | 2 |
8 | cat #3 owned by nobody | good | 3 | 2 | 2 |
9 | cat #4 owned by nobody | good | 4 | 2 | 2 |
10 | cat #5 owned by nobody | good | 5 | 2 | 2 |
11 | cat #1 owned by nobody | bad | 1 | 3 | 3 |
12 | cat #2 owned by nobody | good | 2 | 3 | 3 |
13 | cat #3 owned by nobody | good | 3 | 3 | 3 |
14 | cat #4 owned by nobody | good | 4 | 3 | 3 |
15 | cat #5 owned by nobody | good | 5 | 3 | 3 |
(15 rows)
对于每个generate_series()元组 (column counter),有五个animals()元组 (column ordinality),每个元组都由 逐步跟踪row_number()。
结论
为什么这种顺序性很重要?
您可能很想在函数结果集中包含一些额外的信息,这些信息将简化结果集本身的后处理。当“外部世界”(即使用函数的查询)能够自行添加此类额外信息时,应避免这种做法。您不会浪费资源,还可以让您的代码更简洁、更易读。




