暂无图片
暂无图片
2
暂无图片
暂无图片
暂无图片

PostgreSQL批量插入方式测试

原创 孙莹 2023-10-06
787

PostgreSQL最快批量插入方式测试

copy csv.png

在我们平时日常工作中有很时需要大批量的数据插入操作。本文将测试各种批量插入的方法,通过实战来对比插入的速度

准备环境

首先准备一台2c4g的虚拟机,操作系统CentOS7.6,源码编译安装PostgreSQL14.9,创建数据库testdb、同时创建测试表。分别用不同方式对测试表插入1亿条数据、并记录时间

[root@pg14 ~]# cat /etc/redhat-release CentOS Linux release 7.6.1810 (Core) [root@pg14 ~]# cat /proc/cpuinfo | grep processor | wc -l 2 [root@pg14 ~]# free -h total used free shared buff/cache available Mem: 3.8G 432M 1.6G 68M 1.8G 1.8G Swap: 3.9G 0B 3.9G [root@pg14 ~]# su - postgres Last login: Thu Oct 5 19:05:00 CST 2023 on pts/0 [postgres@pg14 ~]$ postgres --version postgres (PostgreSQL) 14.9 [postgres@pg14 ~]$ createdb testdb [postgres@pg14 ~]$ psql testdb psql (14.9) Type "help" for help. testdb=# CREATE TABLE t (id INT,info TEXT); CREATE TABLE testdb=#

INSERT测试

通过批量执行INSERT INTO的SQL语句进行测试

#构建1亿条插入语句 for((i=1;i<=100000000;i++)); do echo "INSERT INTO t (id,info) VALUES ($i,'hello world');" >> insert_t.sql; done #插入数据库testdb的测试表t中,记录时间 time psql testdb < insert_t.sql > /dev/null

操作如下

[postgres@pg14 ~]$ for((i=1;i<=100000000;i++)); do echo "INSERT INTO t (id,info) VALUES ($i,'hello world');" >> insert_t.sql; done [postgres@pg14 ~]$ time psql testdb < insert_t.sql > /dev/null real 113m51.607s user 6m31.625s sys 6m2.155s [postgres@pg14 ~]$

insert:113分51秒

COPY测试

通过COPY命令将数据从输入源(如文件、标准输入等)复制到数据库表中、分别测试无主键和有主键

#构建1亿条插入语句 echo "COPY t FROM STDIN DELIMITER ',';" > insert_t.csv seq 100000000 | awk '{print $0",hello world"}' >> insert_t.csv #插入数据库testdb的测试表t中,记录时间 time psql testdb < insert_t.csv > /dev/null #清除测试表t DROP TABLE t; #重建测试表t添加主键 CREATE TABLE t (id INT PRIMARY KEY, info TEXT); \q #插入数据库testdb的测试表t有主键中,记录时间 time psql testdb < insert_t.csv > /dev/null

操作如下

[postgres@pg14 ~]$ echo "COPY t FROM STDIN DELIMITER ',';" > insert_t.csv
[postgres@pg14 ~]$ seq 100000000 | awk '{print $0",hello world"}' >> insert_t.csv
[postgres@pg14 ~]$ psql testdb
psql (14.9)
Type "help" for help.

testdb=# drop table t;
DROP TABLE
testdb=# CREATE TABLE t (id INT,info TEXT);
CREATE TABLE
testdb=# \q
[postgres@pg14 ~]$ time psql testdb < insert_t.csv > /dev/null

real    0m34.019s
user    0m3.658s
sys     0m3.465s
[postgres@pg14 ~]$
[postgres@pg14 ~]$ psql testdb
psql (14.9)
Type "help" for help.

testdb=# DROP TABLE t;
DROP TABLE
testdb=# CREATE TABLE t (id INT PRIMARY KEY, info TEXT);
CREATE TABLE
testdb=# \q
[postgres@pg14 ~]$ time psql testdb < insert_t.csv > /dev/null

real    1m27.267s
user    0m3.449s
sys     0m3.433s
[postgres@pg14 ~]$

copy无主键:34秒

copy有主键:1分27秒

pg_bulkload测试

使用pg_bulkload进行测试,源码文件pg_bulkload-VERSION3_1_20.tar.gz下载,上传到/home/postgres并编译安装

#解压源码文件pg_bulkload-VERSION3_1_20.tar.gz tar xzf pg_bulkload-VERSION3_1_20.tar.gz #进入目录 cd pg_bulkload-VERSION3_1_20/ #编译安装pg_bulkload make && make install #进入testdb数据库 psql testdb #创建扩展pg_bulkload CREATE EXTENSION pg_bulkload; #清除测试表t DROP TABLE t; #重新创建 CREATE TABLE t (id INT, info TEXT); \q #删除insert_t.csv的第一行COPY t FROM STDIN DELIMITER ',';保留需要复制的数据 sed -i '1d' insert_t.csv #测试pg_bulkload,记录时间 time pg_bulkload -i insert_t.csv -O t -l outlog.txt -P bad.txt -o "TYPE=CSV" -o "DELIMITER=," -d testdb -U postgres

操作如下

[postgres@pg14 ~]$ tar xzf pg_bulkload-VERSION3_1_20.tar.gz [postgres@pg14 ~]$ cd pg_bulkload-VERSION3_1_20/ [postgres@pg14 pg_bulkload-VERSION3_1_20]$ make && make install Makefile:33: warning: overriding recipe for target `check' /opt/pg14/lib/postgresql/pgxs/src/makefiles/pgxs.mk:446: warning: ignoring old recipe for target `check' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/bin' gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pg_bulkload.o pg_bulkload.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o recovery.o recovery.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pgut/pgut.o pgut/pgut.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pgut/pgut-fe.o pgut/pgut-fe.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pgut/pgut-list.o pgut/pgut-list.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 pg_bulkload.o recovery.o pgut/pgut.o pgut/pgut-fe.o pgut/pgut-list.o -L/opt/pg14/lib -Wl,--as-needed -Wl,-rpath,'/opt/pg14/lib',--enable-new-dtags -Wl,--build-id -L/opt/pg14/lib -lpq -L/opt/pg14/lib/postgresql -lpgcommon -lpgport -lssl -lcrypto -lz -lreadline -lpthread -lrt -ldl -lm -o pg_bulkload make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/bin' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/lib' gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o binary.o binary.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o logger.o logger.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o parser_binary.o parser_binary.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o parser_csv.o parser_csv.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o parser_function.o parser_function.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o parser_tuple.o parser_tuple.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pg_btree.o pg_btree.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pg_bulkload.o pg_bulkload.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pg_strutil.o pg_strutil.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o reader.o reader.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o source.o source.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o writer.o writer.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o writer_binary.o writer_binary.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o writer_buffered.o writer_buffered.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o writer_direct.o writer_direct.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o writer_parallel.o writer_parallel.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pgut/pgut-be.o pgut/pgut-be.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I/opt/pg14/include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pgut/pgut-ipc.o pgut/pgut-ipc.c ( echo '{ global:'; gawk '/^[^#]/ {printf "%s;\n",$1}' exports.txt; echo ' local: *; };' ) >exports.list gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -shared -Wl,--version-script=exports.list -o pg_bulkload.so binary.o logger.o parser_binary.o parser_csv.o parser_function.o parser_tuple.o pg_btree.o pg_bulkload.o pg_strutil.o reader.o source.o writer.o writer_binary.o writer_buffered.o writer_direct.o writer_parallel.o pgut/pgut-be.o pgut/pgut-ipc.o -L/opt/pg14/lib -Wl,--as-needed -Wl,-rpath,'/opt/pg14/lib',--enable-new-dtags -Wl,--build-id -L/opt/pg14/lib -lpq -lpthread sed 's,MODULE_PATHNAME,$libdir/pg_bulkload,g' pg_bulkload.sql.in >pg_bulkload.sql make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/lib' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/util' gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I../include -I. -I./ -I/opt/pg14/include/postgresql/server -I/opt/pg14/include/postgresql/internal -D_GNU_SOURCE -c -o pg_timestamp.o pg_timestamp.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -shared -o pg_timestamp.so pg_timestamp.o -L/opt/pg14/lib -Wl,--as-needed -Wl,-rpath,'/opt/pg14/lib',--enable-new-dtags sed 's,MODULE_PATHNAME,$libdir/pg_timestamp,g' pg_timestamp.sql.in >pg_timestamp.sql make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/util' Makefile:33: warning: overriding recipe for target `check' /opt/pg14/lib/postgresql/pgxs/src/makefiles/pgxs.mk:446: warning: ignoring old recipe for target `check' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/bin' make[1]: Nothing to be done for `all'. make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/bin' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/lib' make[1]: Nothing to be done for `all'. make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/lib' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/util' make[1]: Nothing to be done for `all'. make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/util' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/bin' /bin/mkdir -p '/opt/pg14/bin' make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/bin' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/lib' /bin/mkdir -p '/opt/pg14/lib/postgresql' /bin/mkdir -p '/opt/pg14/share/postgresql/extension' /bin/mkdir -p '/opt/pg14/share/postgresql/extension' make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/lib' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/util' /bin/mkdir -p '/opt/pg14/lib/postgresql' /bin/mkdir -p '/opt/pg14/share/postgresql/contrib' make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/util' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/bin' /bin/mkdir -p '/opt/pg14/bin' /bin/install -c pg_bulkload '/opt/pg14/bin' /bin/install -c -m 755 .//postgresql '/opt/pg14/bin/' make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/bin' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/lib' /bin/mkdir -p '/opt/pg14/lib/postgresql' /bin/mkdir -p '/opt/pg14/share/postgresql/extension' /bin/mkdir -p '/opt/pg14/share/postgresql/extension' /bin/install -c -m 755 pg_bulkload.so '/opt/pg14/lib/postgresql/pg_bulkload.so' /bin/install -c -m 644 .//pg_bulkload.control '/opt/pg14/share/postgresql/extension/' /bin/install -c -m 644 .//pg_bulkload--1.0.sql .//pg_bulkload--unpackaged--1.0.sql .//uninstall_pg_bulkload.sql pg_bulkload.sql '/opt/pg14/share/postgresql/extension/' make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/lib' make[1]: Entering directory `/home/postgres/pg_bulkload-VERSION3_1_20/util' /bin/mkdir -p '/opt/pg14/lib/postgresql' /bin/mkdir -p '/opt/pg14/share/postgresql/contrib' /bin/install -c -m 755 pg_timestamp.so '/opt/pg14/lib/postgresql/pg_timestamp.so' /bin/install -c -m 644 .//uninstall_pg_timestamp.sql pg_timestamp.sql '/opt/pg14/share/postgresql/contrib/' make[1]: Leaving directory `/home/postgres/pg_bulkload-VERSION3_1_20/util' [postgres@pg14 pg_bulkload-VERSION3_1_20]$ cd [postgres@pg14 ~]$ psql testdb psql (14.9) Type "help" for help. testdb=# CREATE EXTENSION pg_bulkload; CREATE EXTENSION testdb=# DROP TABLE t; DROP TABLE testdb=# CREATE TABLE t (id INT, info TEXT); CREATE TABLE testdb=# \q [postgres@pg14 ~]$ sed -i '1d' insert_t.csv [postgres@pg14 ~]$ ll -h total 7.3G -rw-rw-r--. 1 postgres postgres 2.0G Oct 5 22:07 insert_t.csv -rw-rw-r--. 1 postgres postgres 5.3G Oct 5 19:50 insert_t.sql drwxrwxr-x. 10 postgres postgres 175 Jan 16 2023 pg_bulkload-VERSION3_1_20 -rw-r--r--. 1 postgres postgres 377K Oct 5 21:59 pg_bulkload-VERSION3_1_20.tar.gz drwxrwxr-x. 6 postgres postgres 4.0K Oct 5 19:05 postgresql-14.9 -rw-r--r--. 1 postgres postgres 28M Oct 5 19:03 postgresql-14.9.tar.gz [postgres@pg14 ~]$ time pg_bulkload -i insert_t.csv -O t -l outlog.txt -P bad.txt -o "TYPE=CSV" -o "DELIMITER=," -d testdb -U postgres NOTICE: BULK LOAD START NOTICE: BULK LOAD END 0 Rows skipped. 100000000 Rows successfully loaded. 0 Rows not loaded due to parse errors. 0 Rows not loaded due to duplicate errors. 0 Rows replaced with new rows. real 0m17.004s user 0m0.000s sys 0m0.005s [postgres@pg14 ~]$

pg_bulkload:17秒

PostgreSQL16的COPY测试

16新特性COPY比原来的速度提升,先将14.9升级到16测试

#安装pg16所需ICU依赖包 yum install -y libicu-devel #下载源码 wget https://ftp.postgresql.org/pub/source/v16.0/postgresql-16.0.tar.gz --no-check-certificate #解压pg16源码 tar xzf postgresql-16.0.tar.gz #进入pg16目录 cd postgresql-16.0/ #创建pg16软件目录 mkdir -p /opt/pg16 #设置软件安装目录 ./configure --prefix=/opt/pg16 --with-pgport=5432 --with-openssl #源码安装 gmake world && gmake install-world #进入testdb psql testdb #删除pg16不兼容插件 DROP EXTENSION pg_bulkload; \q #创建pg16数据库目录 mkdir /opt/pgdata16 #初始化pg16数据库 /opt/pg16/bin/initdb -D/opt/pgdata16 -k -EUTF8 #检查pg16升级 /opt/pg16/bin/pg_upgrade --old-datadir /opt/pgdata/ --new-datadir /opt/pgdata16/ --old-bindir /opt/pg14/bin/ --new-bindir /opt/pg16/bin/ --check #关闭原pg14.9数据库 pg_ctl stop #进行升级 /opt/pg16/bin/pg_upgrade --old-datadir /opt/pgdata/ --new-datadir /opt/pgdata16/ --old-bindir /opt/pg14/bin/ --new-bindir /opt/pg16/bin/ cd ~ #修改.bash_profile vim .bash_profile #应用.bash_profile source .bash_profile #修改postgresql.conf vim /opt/pgdata16/postgresql.conf #修改pg_hba.conf vim /opt/pgdata16/pg_hba.conf #查看升级后的版本 postgres --version #启动数据库实列 pg_ctl start #执行VACCUMDB整库 /opt/pg16/bin/vacuumdb --all --analyze-in-stages #进入testdb psql testdb #清理测试表t DROP TABLE t; #重新创建测试表t CREATE TABLE t (id INT, info TEXT); #原来insert_t.csv的第一行插入COPY t FROM STDIN DELIMITER ','; sed -i "1i COPY t FROM STDIN DELIMITER ',';" insert_t.csv #在postgresql16版本执行COPY插入测试表t,记录时间

操作如下

gmake[2]: Leaving directory `/home/postgres/postgresql-16.0/contrib/sslinfo' gmake[1]: Leaving directory `/home/postgres/postgresql-16.0/contrib' [postgres@pg14 postgresql-16.0]$ psql testdb psql (14.9) Type "help" for help. testdb=# DROP EXTENSION pg_bulkload; DROP EXTENSION testdb=# \q [postgres@pg14 postgresql-16.0]$ mkdir /opt/pgdata16 [postgres@pg14 postgresql-16.0]$ /opt/pg16/bin/initdb -D/opt/pgdata16 -k -EUTF8 The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "en_US.UTF-8". The default text search configuration will be set to "english". Data page checksums are enabled. fixing permissions on existing directory /opt/pgdata16 ... ok creating subdirectories ... ok selecting dynamic shared memory implementation ... posix selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting default time zone ... PRC creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok syncing data to disk ... ok initdb: warning: enabling "trust" authentication for local connections initdb: hint: You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. You can now start the database server using: /opt/pg16/bin/pg_ctl -D /opt/pgdata16 -l logfile start [postgres@pg14 postgresql-16.0]$ /opt/pg16/bin/pg_upgrade --old-datadir /opt/pgdata/ --new-datadir /opt/pgdata16/ --old-bindir /opt/pg14/bin/ --new-bindir /opt/pg16/bin/ --check Performing Consistency Checks on Old Live Server ------------------------------------------------ Checking cluster versions ok Checking database user is the install user ok Checking database connection settings ok Checking for prepared transactions ok Checking for system-defined composite types in user tables ok Checking for reg* data types in user tables ok Checking for contrib/isn with bigint-passing mismatch ok Checking for incompatible "aclitem" data type in user tables ok Checking for presence of required libraries ok Checking database user is the install user ok Checking for prepared transactions ok Checking for new cluster tablespace directories ok *Clusters are compatible* [postgres@pg14 postgresql-16.0]$ pg_ctl stop waiting for server to shut down.... done server stopped [postgres@pg14 postgresql-16.0]$ /opt/pg16/bin/pg_upgrade --old-datadir /opt/pgdata/ --new-datadir /opt/pgdata16/ --old-bindir /opt/pg14/bin/ --new-bindir /opt/pg16/bin/ Performing Consistency Checks ----------------------------- Checking cluster versions ok Checking database user is the install user ok Checking database connection settings ok Checking for prepared transactions ok Checking for system-defined composite types in user tables ok Checking for reg* data types in user tables ok Checking for contrib/isn with bigint-passing mismatch ok Checking for incompatible "aclitem" data type in user tables ok Creating dump of global objects ok Creating dump of database schemas ok Checking for presence of required libraries ok Checking database user is the install user ok Checking for prepared transactions ok Checking for new cluster tablespace directories ok If pg_upgrade fails after this point, you must re-initdb the new cluster before continuing. Performing Upgrade ------------------ Setting locale and encoding for new cluster ok Analyzing all rows in the new cluster ok Freezing all rows in the new cluster ok Deleting files from new pg_xact ok Copying old pg_xact to new server ok Setting oldest XID for new cluster ok Setting next transaction ID and epoch for new cluster ok Deleting files from new pg_multixact/offsets ok Copying old pg_multixact/offsets to new server ok Deleting files from new pg_multixact/members ok Copying old pg_multixact/members to new server ok Setting next multixact ID and offset for new cluster ok Resetting WAL archives ok Setting frozenxid and minmxid counters in new cluster ok Restoring global objects in the new cluster ok Restoring database schemas in the new cluster ok Copying user relation files ok Setting next OID for new cluster ok Sync data directory to disk ok Creating script to delete old cluster ok Checking for extension updates ok Upgrade Complete ---------------- Optimizer statistics are not transferred by pg_upgrade. Once you start the new server, consider running: /opt/pg16/bin/vacuumdb --all --analyze-in-stages Running this script will delete the old cluster's data files: ./delete_old_cluster.sh [postgres@pg14 postgresql-16.0]$ vi /home/postgres/.bash_profile #PostgreSQL settings export PGPORT=5432 export PGUSER=postgres export PGHOME=/opt/pg16 export PGDATA=/opt/pgdata16 export LD_LIBRARY_PATH=$PGHOME/lib export MANPATH=$PGHOME/share/man export PATH=$PGHOME/bin:$PATH export LANG="en_US.UTF-8" [postgres@pg14 postgresql-16.0]$ cd [postgres@pg14 ~]$ source .bash_profile [postgres@pg14 ~]$ vim /opt/pgdata16/postgresql.conf listen_addresses = '*' port = 5432 shared_buffers = 1GB work_mem = 30MB maintenance_work_mem = 256MB temp_buffers = 256MB max_connections = 500 checkpoint_completion_target = 0.9 wal_buffers = 16MB min_wal_size = 4GB max_wal_size = 64GB wal_log_hints = on wal_keep_size = 1000 default_statistics_target = 100 random_page_cost = 1.1 effective_io_concurrency = 200 wal_level = replica logging_collector = on log_directory = 'log' log_destination = 'csvlog' log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' log_file_mode = 0600 log_rotation_age = 1d log_rotation_size = 0 log_truncate_on_rotation = off [postgres@pg14 ~]$ vim /opt/pgdata16/pg_hba.conf host all all 0/0 scram-sha-256 [postgres@pg14 ~]$ postgres --version postgres (PostgreSQL) 16.0 [postgres@pg14 ~]$ pg_ctl start waiting for server to start....2023-10-05 22:57:43.828 CST [94268] LOG: redirecting log output to logging collector process 2023-10-05 22:57:43.828 CST [94268] HINT: Future log output will appear in directory "log". done server started [postgres@pg14 ~]$ /opt/pg16/bin/vacuumdb --all --analyze-in-stages vacuumdb: processing database "postgres": Generating minimal optimizer statistics (1 target) vacuumdb: processing database "template1": Generating minimal optimizer statistics (1 target) vacuumdb: processing database "testdb": Generating minimal optimizer statistics (1 target) vacuumdb: processing database "postgres": Generating medium optimizer statistics (10 targets) vacuumdb: processing database "template1": Generating medium optimizer statistics (10 targets) vacuumdb: processing database "testdb": Generating medium optimizer statistics (10 targets) vacuumdb: processing database "postgres": Generating default (full) optimizer statistics vacuumdb: processing database "template1": Generating default (full) optimizer statistics vacuumdb: processing database "testdb": Generating default (full) optimizer statistics [postgres@pg14 ~]$ psql testdb psql (16.0) Type "help" for help. testdb=# DROP TABLE t; DROP TABLE testdb=# CREATE TABLE t (id INT, info TEXT); CREATE TABLE testdb=# \q [postgres@pg14 ~]$ sed -i "1i COPY t FROM STDIN DELIMITER ',';" insert_t.csv [postgres@pg14 ~]$ head -10 insert_t.csv COPY t FROM STDIN DELIMITER ','; 1,hello world 2,hello world 3,hello world 4,hello world 5,hello world 6,hello world 7,hello world 8,hello world 9,hello world [postgres@pg14 ~]$ time psql testdb < insert_t.csv > /dev/null real 0m25.064s user 0m1.273s sys 0m2.768s [postgres@pg14 ~]$

PostgreSQL16的COPY:25秒

总结

通过上面的对1亿条数据插入实测,我们可以得到如下比较的表格。由于pg_bulkload是不需要经过shared buffer和wal buffer,直接写文件所以最快,PG16版本优化了COPY的等待事件开销、文件扩展的优化使得比原先版本的效率有所提升。COPY操作原表有主键效率稍微低一些。而INSERT操作由于需要一条一条SQL解析、执行等就变得效率特别低。

INSERT COPY无主键 COPY有主键 pg_bulkload pg16COPY无主键
113分51秒 34秒 1分27秒 17秒 25秒
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论