【译】使用select *时排除某些列的psql小技巧

原创 Robin 2022-05-26

6124

原文地址：https://postgresql.verite.pro/blog/2022/02/21/psql-hack-select-except.html

有时在编写以SELECT * FROM …，开头的查询时，我们希望从结果中排除一些列。不幸的是，SQL语法并没有为这种需求提供解决方案^[1]；相反，我们需要列出所有要包括的列。最近，在pgsql-general邮件列表中的一个子讨论建议，对于某些使用案例，如果客户端能够将*转换为实际的列名，那么客户端的解决方案可能就足够好了。
这让我想到：psql不是已经有做这个的部分了吗？

自11版以来，它可以用\gdesc查询一个查询的结果集的结构，而不执行它（只是准备好了查询）。结果是一个带有名称和类型的列的列表。
自第12版以来，它可以用\pset format csv将该结构输出为CSV。
如果在调用\gdesc之前给出了\o filename，该输出可以写入一个文件。
它可以用 \e调用一个可配置的编辑器。如果那个编辑器用select <post-processed list obtained from \gdesc>替换select *，那么我们就接近目标了。
最后，编辑器可以通过 \setenv 在其环境中传递缓冲区外的背景信息。

但是，我们能否将这些部分组成某种命令，使之完全自动化？碰巧的是，这是有可能的。让我们来看看如何做…

psqlrc

首先，我们需要在.psqlrc文件中进行这两个声明。

\setenv PSQL_EDITOR ~/bin/psql-edit-replace.sh

-- declare an :expand variable to use as a macro (sort of)
-- must be kept as a single long line

\set expand ' \\set _tmpstruct `tempfile` \\setenv PSQL_TMP_STRUCT :_tmpstruct \\set QUIET on \\pset format csv \\x off \\pset tuples_only off \\o :_tmpstruct \\gdesc \\o \\pset format aligned \\set QUIET off \\e \\unset _tmpstruct'

现在，当我们在一个有效的查询结束时，在psql中输入:expand和回车键（而不是最后的分号;或 \g元命令），它将运行这一长串的元命令，这些命令基本上是将查询描述成一个单独的临时文件，并启动我们由PSQL_EDITOR指向的自定义编辑器。

该调用有一些副作用：它将以下参数重置为其默认值。QUIET=off, format=aligned, tuples_only=off, expanded=off。这不容易避免，所以我没有坚持保存和恢复这些参数，尽管我认为可以通过一些额外的工作来实现。

自定义前端编辑器

我的自定义编辑器是一个bash脚本（源代码在文章末尾），作为普通编辑器的前端。在把查询传给它之前，它将检查$PSQL_TMP_STRUCT所指向的文件，以及查询缓冲区第一行中是否存在特定的标记。我选择的标记是：

* /*expand*/：这个文本将被查询的列所取代。
* /*except:col1,col2,...*/：这段文字将被查询的列所取代，但在此所列的列不会显示。
* /*except-type:bytea,jsonb,...*/：这段文字将被查询的列所取代，所列类型的列不显示。

如果结构文件存在并且发现了这些标记之一，一个嵌入的Perl脚本就会通过查询缓冲区，在原地重写它。然后调用$EDITOR（默认为vi）所指向的 "正常 "编辑器。当然，一旦进入真正的编辑器，通常可以进行任何额外的编辑，就像注入的列列表最初是在psql中输入的一样。

实例

让我们创建几个表并尝试每种形式的调用。

CREATE TABLE users(user_id uuid PRIMARY KEY, name text, picture bytea);
CREATE TABLE posts(post_id uuid PRIMARY KEY, user_id uuid REFERENCES users(user_id), subject text, contents text, created_at timestamptz);

列出全部列

在psql中调用：

SELECT * /*expand*/ FROM users JOIN posts USING(user_id) :expand

在编辑器中获得：

SELECT "user_id",
        "name",
        "picture",
        "post_id",
        "subject",
        "contents",
        "created_at"
 FROM users JOIN posts USING(user_id)

按名称删除一些列

在psql中调用：

SELECT * /*except:picture,contents*/ FROM users JOIN posts USING(user_id) :expand

结果：

SELECT "name",
        "subject",
        "contents",
        "created_at"
 FROM users JOIN posts USING(user_id)

按类型删除一些列

在psql中调用：

SELECT * /*except-type:uuid,bytea*/ FROM users JOIN posts USING(user_id) :expand

结果：

SELECT "name",
        "subject",
        "contents",
        "created_at"
 FROM users JOIN posts USING(user_id)

前端实现

下面是一个bash+perl编辑器前端的源代码，做了上面描述的事情。注意，Perl部分只有在psql中使用了:expand时才会运行，这样可以最大限度地减少资源占用和在正常使用 \e 或 \ef时弄乱缓冲区的风险。

这些文件可以在github上找到。如果你想讨论这些代码，可以在那里开问题。

#!/bin/bash

# A custom editor for psql that pre-processes the query string
# to replace "* /* special comment */" with a list of columns.
# The columns are passed in a temporary file pointed to by
# the PSQL_TMP_STRUCT environment variable.

# Set up PSQL_EDITOR to point to that script.
# See the macro invocation in psqlrc-for-edit-replace

read -r line1 < "$1"
rx='\*\s*/\*(expand|except:|except-type:).*\*/'
if [[ $line1 =~ $rx && -r "$PSQL_TMP_STRUCT" ]]; then
  perl - $1 "$PSQL_TMP_STRUCT" << "EOP"
require 5.014;
use Text::CSV qw(csv);

sub expand {
  # filter and format the list of columns
  my ($cols,$filter_type,$filter) = @_;
  # filter_type => undef:none, 0:by name, 1: by type
  my $qi = 1; # quote the columns (for case sensitive names and reserved keywords)
  if (defined $filter_type) {
    my @xcols = split /,/, $filter;	# list of arguments inside the comment
    my %xhcols = map { $_=>1 } @xcols;
    $cols = [ grep { !defined $xhcols{$_->[$filter_type]} } @{$cols} ];
  }
  return join ",\n\t", (map { $qi?('"' . $_->[0]=~ s/"/""/r . '"') : $_->[0]}
  	      	        @{$cols});
}

my $cols = csv(in=>$ARGV[1], headers=>"skip", binary=>1);
open(my $fi, "<", $ARGV[0]) or die "cannot open $ARGV[0]: $!";
my $lines = <$fi>;   # 1st line of query

my $rx = qr{^(.*)\*\s*/\*expand\*/(.*)$};
if ($lines =~ $rx) {
  # expand to all columns
  $lines = "$1" . expand($cols, undef, undef) . "\n$2";
}
else {
  $rx = qr{^(.*)\*\s*/\*except:(.*)\*/(.*)$};
  if ($lines =~ $rx) {
    # expand to all columns except those listed
    $lines = "$1" . expand($cols, 0, $2) . "\n$3";
  }
  else {
    $rx = qr{^(.*)\*\s*/\*except-type:(.*)\*/(.*)$};
    if ($lines =~ $rx) {
	  # expand to all column except for the types listed
      $lines = "$1" . expand($cols, 1, $2) . "\n$3";
    }
  }
}
# copy the rest of the lines
do {
  $lines .= $_;
} while (<$fi>);
close $fi;
# overwrite the file with the new query
open (my $fo, ">", $ARGV[0]) or die "cannot open $ARGV[0] for writing: $!";
print $fo $lines;
close $fo;
EOP

  # When the replacement in the query buffer occurred, we could
  # return into psql at this point rather than going into the actual
  # editor.
  # But before version 13, psql won't display the modified
  # query when returning at this point, so it might seem opaque.
  # Let's always call the actual editor, but you may uncomment
  # the line below to skip it.

  # rm -f "$PSQL_TMP_STRUCT" ; exit
fi
rm -f "$PSQL_TMP_STRUCT"
${EDITOR:-vi} $*

SQL:2016标准定义了多态函数表，有可能被用来建立一个解决方案，但这种非常特殊的函数在PostgreSQL中还没有实现。 ↩︎

墨力计划 psql

最后修改时间：2022-05-26 16:33:08

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

【译】使用select *时排除某些列的psql小技巧

psqlrc

自定义前端编辑器

实例

列出全部列

按名称删除一些列

按类型删除一些列

前端实现

评论