PanWeiDB: Escaping a PG_XLOG Crisis, a Last-Stand Recovery Guide After Accidental Deletion in a Cluster

Original, PanWeiDB, 2024-11-30

1. Introduction

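In a cluster, accidentally deleting the pg_xlog directory is like an unexpected "escape crisis": one slip and the whole system can grind to a halt. This document walks through WAL recovery, from a single node losing its logs to all three nodes being wiped out. Whether you face a mild single-node deletion or a severe loss of logs on every node, you will learn how to use tools such as pw_resetxlog to stage a "last-stand comeback". Each step is meant to help you restore the cluster quickly and safely and bring the data back to life.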

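pw_resetxlog resets the database's transaction log information, which can put data integrity at risk. The official PostgreSQL recommendation is to follow this operation with a logical export (pw_dump here), then rebuild the data directory with initdb and restore from the dump file, to ensure consistency and data safety.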

2. Pre-test Setup

2.1 Check the cluster status and the archive mode

gs_om -t status --detail
show archive_mode;
postgres=# show archive_mode;
 archive_mode 
--------------
 off
(1 row)

2.2 Create the test table test_data

CREATE TABLE test_data (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50),
    value INT
);

2.3 Bulk-load data

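This PL/pgSQL block loops 2,000 times and inserts 10,000 rows per iteration, for a total of 20 million rows. The transaction is committed as soon as each batch has been inserted.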

DO $$
DECLARE
    i INT := 1;
BEGIN
    FOR i IN 1..2000 LOOP
        INSERT INTO test_data (name, value)
        SELECT 'Name_' || seq, seq
        FROM generate_series((i-1)*10000 + 1, i*10000) AS seq;
        -- Commit after each batch of 10,000 rows
        COMMIT;
    END LOOP;
END $$;
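
The batch arithmetic in the block above can be checked with a small shell sketch. The helper names batch_range and total_rows are hypothetical and exist only for this illustration. They mirror the bounds passed to generate_series, namely (i-1)*10000 + 1 through i*10000.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; not part of PanWeiDB.

# batch_range I SIZE: print the first and last id covered by batch I,
# mirroring generate_series((i-1)*10000 + 1, i*10000) from the DO block.
batch_range() {
    local i=$1 size=$2
    echo "$(( (i - 1) * size + 1 )) $(( i * size ))"
}

# total_rows BATCHES SIZE: total number of rows inserted across all batches.
total_rows() {
    echo "$(( $1 * $2 ))"
}

batch_range 1 10000      # prints: 1 10000
batch_range 2000 10000   # prints: 19990001 20000000
total_rows 2000 10000    # prints: 20000000
```

The last batch ends exactly at row 20,000,000, which matches the total stated above.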

2.4 Script that deletes the pg_xlog contents

This script runs rm -rf * in /database/panweidb/data/pg_xlog 60 times, once per second.

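#!/bin/bash

# Enter the directory to be cleaned; abort if it is missing so that
# rm -rf * never runs in the wrong directory
cd /database/panweidb/data/pg_xlog || exit 1

# Initialize the counter
counter=0

# Run 60 times, once per second
while [ $counter -lt 60 ]; do
    echo "Running deletion pass $((counter+1))"
    rm -rf *
    sleep 1
    ((counter++))
done

echo "Deletion task finished"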
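
Because the script deletes with a bare glob relative to the working directory, a guarded helper can limit the damage if it is ever pointed at the wrong place. The following is only a sketch under that assumption. The function name purge_xlog_contents is hypothetical, and the check on the directory name is a design choice made for this example rather than anything PanWeiDB requires.

```shell
#!/bin/bash
# Hypothetical helper for illustration; not part of PanWeiDB.

# purge_xlog_contents DIR: delete everything inside DIR, but only if DIR
# exists and its last path component is pg_xlog. Returns 1 otherwise.
purge_xlog_contents() {
    local dir=$1
    if [ ! -d "$dir" ] || [ "$(basename "$dir")" != "pg_xlog" ]; then
        echo "refusing to purge: $dir" >&2
        return 1
    fi
    # Same effect as "cd DIR && rm -rf *", without depending on the cwd.
    rm -rf -- "$dir"/*
}
```

To reproduce the test, the helper would be called as purge_xlog_contents /database/panweidb/data/pg_xlog inside the same 60-iteration loop with sleep 1.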

3. Test 1: Deletion on a single node

3.1 Check the primary node's status after the deletion

gs_om -t status --detail
[omm@panwei-a1 ~]$ gs_om -t status --detail
[  CMServer State   ]

node         node_ip         instance                             state
-------------------------------------------------------------------------
1  panweidb01 xxx   1    /database/panweidb/cm/cm_server Primary
2  panweidb02 xxx   2    /database/panweidb/cm/cm_server Standby
3  panweidb03 xxx   3    /database/panweidb/cm/cm_server Standby

[   Cluster State   ]

cluster_state   : Degraded
redistributing  : No
balanced        : No
current_az      : AZ_ALL

[  Datanode State   ]

node         node_ip         instance                     state            
---------------------------------------------------------------------------
1  panweidb01 xxx   6001 /database/panweidb/data P Down    Unknown
2  panweidb02 xxx   6002 /database/panweidb/data S Primary Normal
3  panweidb03 xxx   6003 /database/panweidb/data S Standby Normal
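
When the status output is long, the unhealthy datanodes can be filtered out mechanically. The sketch below assumes the column layout shown in the sample above: node id, node name, IP, instance id, data directory, initial role, current state, status. The function name list_abnormal_datanodes is hypothetical. It reports every datanode row whose last column is not Normal.

```shell
#!/bin/bash
# Hypothetical helper for illustration; assumes the column layout of the
# "gs_om -t status --detail" sample shown in this article.

# list_abnormal_datanodes: read the status output on stdin and print
# "<node_name> <instance_id> <state>" for each datanode row whose last
# column is not "Normal".
list_abnormal_datanodes() {
    awk '
        /Datanode State/   { in_dn = 1; next }   # start of the datanode section
        in_dn && /^\[/     { in_dn = 0 }         # another section begins
        in_dn && $1 ~ /^[0-9]+$/ && $NF != "Normal" { print $2, $4, $7 }
    '
}
```

It would typically be used as gs_om -t status --detail | list_abnormal_datanodes. For the output above it would list only the instance 6001 on panweidb01.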

3.2 Run build on the primary node

[omm@panweidb01 ~]$ pw_ctl build -D $PGDATA
[2024-11-27 09:47:54.733][2449878][][pw_ctl]: gs_ctl incremental build ,datadir is /database/panweidb/data
[2024-11-27 09:47:54.733][2449878][][pw_ctl]: fopen build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 09:47:54.733][2449878][][pw_ctl]: fprintf build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 09:47:54.734][2449878][][pw_ctl]: fsync build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 09:47:54.734][2449878][][pw_ctl]: stop failed, killing panweidb by force ...
[2024-11-27 09:47:54.734][2449878][][pw_ctl]: command [ps c -eo pid,euid,cmd | grep panweidb | grep -v grep | awk '{if($2 == curuid && $1!="-n") print "/proc/"$1"/cwd"}' curuid=`id -u`| xargs ls -l | awk '{if ($NF=="/database/panweidb/data")  print $(NF-2)}' | awk -F/ '{print $3 }' | xargs kill -9 >/dev/null 2>&1 ] path: [/database/panweidb/data] 
[2024-11-27 09:47:54.748][2449878][][pw_ctl]: server stopped
[2024-11-27 09:47:54.749][2449878][dn_6001_6002_6003][pw_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 09:47:54.757][2449878][dn_6001_6002_6003][pw_ctl]: build try host(xxx) port(15401) success
[2024-11-27 09:47:54.758][2449878][dn_6001_6002_6003][gs_rewind]: set gaussdb state file when incremental build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2024-11-27 09:47:54.796][2449878][dn_6001_6002_6003][gs_rewind]: connected to server: host=xxx port=15401 dbname=postgres application_name=gs_rewind connect_timeout=5  rw_timeout=600
[2024-11-27 09:47:54.801][2449878][dn_6001_6002_6003][gs_rewind]: connect to primary success
[2024-11-27 09:47:54.808][2449878][dn_6001_6002_6003][gs_rewind]: get pg_control success
[2024-11-27 09:47:54.808][2449878][dn_6001_6002_6003][gs_rewind]: target server was interrupted in mode 6.
[2024-11-27 09:47:54.808][2449878][dn_6001_6002_6003][gs_rewind]: sanityChecks success
[2024-11-27 09:47:54.808][2449878][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 10/FE000528 and checkpoint redo at 10/FE0004A8 from source control file
[2024-11-27 09:47:54.808][2449878][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 10/F63D9810 and checkpoint redo at 10/F4935DB8 from target control file
[2024-11-27 09:47:54.808][2449878][dn_6001_6002_6003][gs_rewind]: find max lsn fail, errmsg:failed to translate name to xlog: 

 
gs_rewind receive FATAL, it will exit
[2024-11-27 09:47:54.808][2449878][dn_6001_6002_6003][pw_ctl]: inc build failed.
[2024-11-27 09:47:54.809][2449878][dn_6001_6002_6003][pw_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 09:47:54.815][2449878][dn_6001_6002_6003][pw_ctl]: build try host(xxx) port(15401) success
[2024-11-27 09:47:54.815][2449878][dn_6001_6002_6003][gs_rewind]: set gaussdb state file when incremental build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2024-11-27 09:47:54.850][2449878][dn_6001_6002_6003][gs_rewind]: connected to server: host=xxx port=15401 dbname=postgres application_name=gs_rewind connect_timeout=5  rw_timeout=600
[2024-11-27 09:47:54.853][2449878][dn_6001_6002_6003][gs_rewind]: connect to primary success
[2024-11-27 09:47:54.854][2449878][dn_6001_6002_6003][gs_rewind]: get pg_control success
[2024-11-27 09:47:54.854][2449878][dn_6001_6002_6003][gs_rewind]: target server was interrupted in mode 6.
[2024-11-27 09:47:54.854][2449878][dn_6001_6002_6003][gs_rewind]: sanityChecks success
[2024-11-27 09:47:54.854][2449878][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 10/FE000528 and checkpoint redo at 10/FE0004A8 from source control file
[2024-11-27 09:47:54.854][2449878][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 10/F63D9810 and checkpoint redo at 10/F4935DB8 from target control file
[2024-11-27 09:47:54.854][2449878][dn_6001_6002_6003][gs_rewind]: find max lsn fail, errmsg:failed to translate name to xlog: 

 
gs_rewind receive FATAL, it will exit
[2024-11-27 09:47:54.854][2449878][dn_6001_6002_6003][pw_ctl]: inc build failed.
[2024-11-27 09:47:54.855][2449878][dn_6001_6002_6003][pw_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 09:47:54.861][2449878][dn_6001_6002_6003][pw_ctl]: build try host(xxx) port(15401) success
[2024-11-27 09:47:54.862][2449878][dn_6001_6002_6003][gs_rewind]: set gaussdb state file when incremental build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2024-11-27 09:47:54.896][2449878][dn_6001_6002_6003][gs_rewind]: connected to server: host=xxx port=15401 dbname=postgres application_name=gs_rewind connect_timeout=5  rw_timeout=600
[2024-11-27 09:47:54.899][2449878][dn_6001_6002_6003][gs_rewind]: connect to primary success
[2024-11-27 09:47:54.900][2449878][dn_6001_6002_6003][gs_rewind]: get pg_control success
[2024-11-27 09:47:54.900][2449878][dn_6001_6002_6003][gs_rewind]: target server was interrupted in mode 6.
[2024-11-27 09:47:54.900][2449878][dn_6001_6002_6003][gs_rewind]: sanityChecks success
[2024-11-27 09:47:54.900][2449878][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 10/FE000528 and checkpoint redo at 10/FE0004A8 from source control file
[2024-11-27 09:47:54.900][2449878][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 10/F63D9810 and checkpoint redo at 10/F4935DB8 from target control file
[2024-11-27 09:47:54.900][2449878][dn_6001_6002_6003][gs_rewind]: find max lsn fail, errmsg:failed to translate name to xlog: 

 
gs_rewind receive FATAL, it will exit
[2024-11-27 09:47:54.900][2449878][dn_6001_6002_6003][pw_ctl]: inc build failed.
[2024-11-27 09:47:54.901][2449878][dn_6001_6002_6003][pw_ctl]: inc build failed, change to full build.
[2024-11-27 09:47:54.901][2449878][dn_6001_6002_6003][pw_ctl]: current workdir is (/home/omm).
[2024-11-27 09:47:54.901][2449878][dn_6001_6002_6003][pw_ctl]: set gaussdb state file when auto build build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(FULL_BUILD).
[2024-11-27 09:47:54.902][2449878][dn_6001_6002_6003][gs_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 09:47:54.911][2449878][dn_6001_6002_6003][gs_ctl]: build try host(xxx) port(15401) success
[2024-11-27 09:47:54.911][2449878][dn_6001_6002_6003][gs_ctl]: connected to server success, build started.
[2024-11-27 09:47:54.954][2449878][dn_6001_6002_6003][gs_ctl]: clear old target dir success
[2024-11-27 09:47:54.954][2449878][dn_6001_6002_6003][gs_ctl]: create build tag file success
[2024-11-27 09:47:54.954][2449878][dn_6001_6002_6003][gs_ctl]: create build tag file again success
[2024-11-27 09:47:54.954][2449878][dn_6001_6002_6003][gs_ctl]: get system identifier success
[2024-11-27 09:47:54.955][2449878][dn_6001_6002_6003][gs_ctl]: receiving and unpacking files...
[2024-11-27 09:47:54.955][2449878][dn_6001_6002_6003][gs_ctl]: create backup label success
[2024-11-27 09:47:55.153][2449878][dn_6001_6002_6003][gs_ctl]: xlog start point: 10/FE0004A8
[2024-11-27 09:47:55.153][2449878][dn_6001_6002_6003][gs_ctl]: begin build tablespace list
[2024-11-27 09:47:55.153][2449878][dn_6001_6002_6003][gs_ctl]: finish build tablespace list
[2024-11-27 09:47:55.153][2449878][dn_6001_6002_6003][gs_ctl]: begin get xlog by xlogstream
[2024-11-27 09:47:55.153][2449878][dn_6001_6002_6003][gs_ctl]: starting background WAL receiver
[2024-11-27 09:47:55.153][2449878][dn_6001_6002_6003][gs_ctl]: starting walreceiver
[2024-11-27 09:47:55.153][2449878][dn_6001_6002_6003][gs_ctl]: begin receive tar files
[2024-11-27 09:47:55.154][2449878][dn_6001_6002_6003][gs_ctl]: receiving and unpacking files...
[2024-11-27 09:47:55.166][2449878][dn_6001_6002_6003][gs_ctl]: build try host(xxx) port(15401) success
[2024-11-27 09:47:55.166][2449878][dn_6001_6002_6003][gs_ctl]: check identify system success
[2024-11-27 09:47:55.167][2449878][dn_6001_6002_6003][gs_ctl]: send START_REPLICATION 10/FE000000 success
[2024-11-27 09:47:58.693][2449878][dn_6001_6002_6003][gs_ctl]: finish receive tar files
[2024-11-27 09:47:58.693][2449878][dn_6001_6002_6003][gs_ctl]: xlog end point: 10/FF000178
[2024-11-27 09:47:58.693][2449878][dn_6001_6002_6003][gs_ctl]: waiting for background process to finish streaming...
[2024-11-27 09:48:00.204][2449878][dn_6001_6002_6003][gs_ctl]: starting fsync all files come from source.
[2024-11-27 09:48:01.506][2449878][dn_6001_6002_6003][gs_ctl]: finish fsync all files.
[2024-11-27 09:48:01.507][2449878][dn_6001_6002_6003][gs_ctl]: build dummy dw file success
[2024-11-27 09:48:01.507][2449878][dn_6001_6002_6003][gs_ctl]: rename build status file success
[2024-11-27 09:48:01.532][2449878][dn_6001_6002_6003][gs_ctl]: auto build build completed(/database/panweidb/data).
[2024-11-27 09:48:01.583][2449878][dn_6001_6002_6003][gs_ctl]: waiting for server to start...
.0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

0 LOG:  [Alarm Module]Host Name: panweidb01 

0 LOG:  [Alarm Module]Host IP: panweidb01. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>

0 LOG:  [Alarm Module]Cluster Name: panweidb_Cluster 

0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
 0 [BACKEND] LOG:  the config file /database/panweidb/data/postgresql.conf verify start.
2024-11-27 09:48:01.675 67467a51.10000 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  the config file /database/panweidb/data/postgresql.conf verify success.
2024-11-27 09:48:01.676 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 01000  0 [BACKEND] WARNING:  Error happen when loading license, error code: 2, error message: cannot write data to dir /etc/panweidb/license

2024-11-27 09:48:01.676 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  base_page_saved_interval is 400, ori is 400.
2024-11-27 09:48:01.676 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 DB010  0 [REDO] LOG:  Recovery parallelism, cpu count = 2, max = 4, actual = 2
2024-11-27 09:48:01.676 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 DB010  0 [REDO] LOG:  ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
2024-11-27 09:48:01.681 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

2024-11-27 09:48:01.681 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Host Name: panweidb01 

2024-11-27 09:48:01.681 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Host IP: panweidb01. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>

2024-11-27 09:48:01.681 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Cluster Name: panweidb_Cluster 

2024-11-27 09:48:01.681 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

2024-11-27 09:48:01.683 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  loaded library "security_plugin"
2024-11-27 09:48:01.684 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 01000  0 [BACKEND] WARNING:  could not create any HA TCP/IP sockets
2024-11-27 09:48:01.686 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2024-11-27 09:48:01.686 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  reserved memory for backend threads is: 340 MB
2024-11-27 09:48:01.686 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  reserved memory for WAL buffers is: 320 MB
2024-11-27 09:48:01.687 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  Set max backend reserve memory is: 660 MB, max dynamic memory is: 9931 MB
2024-11-27 09:48:01.687 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  shared memory 1184 Mbytes, memory context 10591 Mbytes, max process memory 12288 Mbytes
2024-11-27 09:48:01.718 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [CACHE] LOG:  set data cache  size(402653184)
2024-11-27 09:48:02.050 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [SEGMENT_PAGE] LOG:  Segment-page constants: DF_MAP_SIZE: 8156, DF_MAP_BIT_CNT: 65248, DF_MAP_GROUP_EXTENTS: 4175872, IPBLOCK_SIZE: 8168, EXTENTS_PER_IPBLOCK: 1021, IPBLOCK_GROUP_SIZE: 4090, BMT_HEADER_LEVEL0_TOTAL_PAGES: 8323072, BktMapEntryNumberPerBlock: 2038, BktMapBlockNumber: 25, BktBitMaxMapCnt: 512
2024-11-27 09:48:02.108 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  panweidb: fsync file "/database/panweidb/data/gaussdb.state.temp" success
2024-11-27 09:48:02.109 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  create gaussdb state file success: db state(STARTING_STATE), server mode(Standby), connection index(1)
2024-11-27 09:48:02.135 67467a51.1 [unknown] 22835516318272 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  max_safe_fds = 974, usable_fds = 1000, already_open = 16
.
[2024-11-27 09:48:03.595][2449878][dn_6001_6002_6003][gs_ctl]:  done
[2024-11-27 09:48:03.595][2449878][dn_6001_6002_6003][gs_ctl]: server started (/database/panweidb/data)
[2024-11-27 09:48:03.595][2449878][dn_6001_6002_6003][gs_ctl]: fopen build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 09:48:03.595][2449878][dn_6001_6002_6003][gs_ctl]: fprintf build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 09:48:03.595][2449878][dn_6001_6002_6003][gs_ctl]: fsync build pid file "/database/panweidb/data/gs_build.pid" success
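
The build log above can be read in three phases. Three incremental build attempts each end with gs_rewind reporting "find max lsn fail", then pw_ctl falls back to a full build, and finally the server starts. The sketch below condenses such a log into one line. It assumes the message strings match those shown above, and summarize_build_log is a hypothetical helper, not a PanWeiDB tool.

```shell
#!/bin/bash
# Hypothetical helper for illustration; relies on the message strings that
# appear in the pw_ctl build log shown in this article.

# summarize_build_log: read pw_ctl build output on stdin and report how many
# incremental attempts failed, whether a full build was used, and whether
# the server came back up.
summarize_build_log() {
    awk '
        /inc build failed\.[[:space:]]*$/        { inc_fail++ }
        /inc build failed, change to full build/ { full = "yes" }
        /server started/                         { started = "yes" }
        END {
            printf "inc_failures=%d full_build=%s server_started=%s\n",
                   inc_fail, (full ? full : "no"), (started ? started : "no")
        }
    '
}
```

One possible use is to capture the output with pw_ctl build -D $PGDATA 2>&1 | tee build.log and then run summarize_build_log < build.log.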

3.3 If needed, switch the primary role back to the original primary node

cm_ctl switchover -D $PGDATA -n 1

4. Test 2: Deletion on all nodes

4.1 Cluster status after the deletion on all three nodes

[omm@panwei-a1 ~]$ gs_om -t status --detail
[  CMServer State   ]

node         node_ip         instance                             state
-------------------------------------------------------------------------
1  panweidb01 xxx   1    /database/panweidb/cm/cm_server Primary
2  panweidb02 xxx   2    /database/panweidb/cm/cm_server Standby
3  panweidb03 xxx   3    /database/panweidb/cm/cm_server Standby

[   Cluster State   ]

cluster_state   : Unavailable
redistributing  : No
balanced        : No
current_az      : AZ_ALL

[  Datanode State   ]

node         node_ip         instance                     state            
---------------------------------------------------------------------------
1  panweidb01 xxx   6001 /database/panweidb/data P Down    Unknown
2  panweidb02 xxx   6002 /database/panweidb/data S Down    Unknown
3  panweidb03 xxx   6003 /database/panweidb/data S Down    Unknown

4.2 cm_agent log

(Screenshot of the cm_agent log; image not included)

4.3 cm_server log

(Screenshot of the cm_server log; image not included)

4.4 Start the recovery

4.4.1 Stop the cluster

gs_om -t stop

4.4.2 Create pg_xlog/archive_status

cd $PGDATA/pg_xlog
mkdir archive_status
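
The two commands above assume that pg_xlog itself still exists. If the directory was removed along with its contents, an idempotent variant may be more convenient. This is a minimal sketch: prepare_xlog_dir is a hypothetical helper, and the resulting permissions simply follow the current umask.

```shell
#!/bin/bash
# Hypothetical helper for illustration; not part of PanWeiDB.

# prepare_xlog_dir DATADIR: make sure DATADIR/pg_xlog/archive_status exists.
# mkdir -p also recreates pg_xlog if it was removed, and does nothing when
# the directories are already present.
prepare_xlog_dir() {
    local datadir=$1
    if [ ! -d "$datadir" ]; then
        echo "data directory not found: $datadir" >&2
        return 1
    fi
    mkdir -p "$datadir/pg_xlog/archive_status"
}
```

It would be called as prepare_xlog_dir "$PGDATA" before the reset step that follows.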

4.4.3 Reset the transaction log with pw_resetxlog

pw_resetxlog /database/panweidb/data

4.4.4 Start the primary node standalone in primary mode

[omm@panweidb01 pg_xlog]$ pw_ctl start -M primary -D $PGDATA
[2024-11-27 10:04:15.800][2460281][][pw_ctl]: pw_ctl started,datadir is /database/panweidb/data 
[2024-11-27 10:04:15.840][2460281][][pw_ctl]: waiting for server to start...
.0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

0 LOG:  [Alarm Module]Host Name: panweidb01 

0 LOG:  [Alarm Module]Host IP: panweidb01. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>

0 LOG:  [Alarm Module]Cluster Name: panweidb_Cluster 

0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
 0 [BACKEND] LOG:  the config file /database/panweidb/data/postgresql.conf verify start.
2024-11-27 10:04:15.921 67467e1f.10000 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  the config file /database/panweidb/data/postgresql.conf verify success.
2024-11-27 10:04:15.922 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 01000  0 [BACKEND] WARNING:  Error happen when loading license, error code: 2, error message: cannot write data to dir /etc/panweidb/license

2024-11-27 10:04:15.922 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  base_page_saved_interval is 400, ori is 400.
2024-11-27 10:04:15.922 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 DB010  0 [REDO] LOG:  Recovery parallelism, cpu count = 2, max = 4, actual = 2
2024-11-27 10:04:15.922 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 DB010  0 [REDO] LOG:  ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
gaussdb.state does not exist, and skipt setting since it is optional.2024-11-27 10:04:15.926 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

2024-11-27 10:04:15.926 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Host Name: panweidb01 

2024-11-27 10:04:15.926 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Host IP: panweidb01. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>

2024-11-27 10:04:15.926 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Cluster Name: panweidb_Cluster 

2024-11-27 10:04:15.927 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

2024-11-27 10:04:15.928 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  loaded library "security_plugin"
2024-11-27 10:04:15.929 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 01000  0 [BACKEND] WARNING:  could not create any HA TCP/IP sockets
2024-11-27 10:04:15.931 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2024-11-27 10:04:15.931 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  reserved memory for backend threads is: 340 MB
2024-11-27 10:04:15.931 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  reserved memory for WAL buffers is: 320 MB
2024-11-27 10:04:15.931 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  Set max backend reserve memory is: 660 MB, max dynamic memory is: 9931 MB
2024-11-27 10:04:15.931 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  shared memory 1184 Mbytes, memory context 10591 Mbytes, max process memory 12288 Mbytes
2024-11-27 10:04:15.967 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [CACHE] LOG:  set data cache  size(402653184)
2024-11-27 10:04:16.315 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [SEGMENT_PAGE] LOG:  Segment-page constants: DF_MAP_SIZE: 8156, DF_MAP_BIT_CNT: 65248, DF_MAP_GROUP_EXTENTS: 4175872, IPBLOCK_SIZE: 8168, EXTENTS_PER_IPBLOCK: 1021, IPBLOCK_GROUP_SIZE: 4090, BMT_HEADER_LEVEL0_TOTAL_PAGES: 8323072, BktMapEntryNumberPerBlock: 2038, BktMapBlockNumber: 25, BktBitMaxMapCnt: 512
2024-11-27 10:04:16.373 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  panweidb: fsync file "/database/panweidb/data/gaussdb.state.temp" success
2024-11-27 10:04:16.373 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  create gaussdb state file success: db state(STARTING_STATE), server mode(Primary), connection index(1)
2024-11-27 10:04:16.396 67467e1f.1 [unknown] 23284595221056 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  max_safe_fds = 976, usable_fds = 1000, already_open = 14
.
[2024-11-27 10:04:17.849][2460281][][pw_ctl]:  done
[2024-11-27 10:04:17.849][2460281][][pw_ctl]: server started (/database/panweidb/data)

4.4.5 Resynchronize the two standby nodes from the primary with build

[omm@panweidb02 ~]$ pw_ctl build -D $PGDATA
[2024-11-27 10:04:47.069][2279278][][pw_ctl]: gs_ctl incremental build ,datadir is /database/panweidb/data
[2024-11-27 10:04:47.069][2279278][][pw_ctl]: fopen build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 10:04:47.069][2279278][][pw_ctl]: fprintf build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 10:04:47.070][2279278][][pw_ctl]: fsync build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 10:04:47.070][2279278][][pw_ctl]: stop failed, killing panweidb by force ...
[2024-11-27 10:04:47.070][2279278][][pw_ctl]: command [ps c -eo pid,euid,cmd | grep panweidb | grep -v grep | awk '{if($2 == curuid && $1!="-n") print "/proc/"$1"/cwd"}' curuid=`id -u`| xargs ls -l | awk '{if ($NF=="/database/panweidb/data")  print $(NF-2)}' | awk -F/ '{print $3 }' | xargs kill -9 >/dev/null 2>&1 ] path: [/database/panweidb/data] 
[2024-11-27 10:04:47.082][2279278][][pw_ctl]: server stopped
[2024-11-27 10:04:47.083][2279278][dn_6001_6002_6003][pw_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 10:04:47.090][2279278][dn_6001_6002_6003][pw_ctl]: build try host(xxx) port(15401) success
[2024-11-27 10:04:47.090][2279278][dn_6001_6002_6003][gs_rewind]: set gaussdb state file when incremental build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2024-11-27 10:04:47.124][2279278][dn_6001_6002_6003][gs_rewind]: connected to server: host=xxx port=15401 dbname=postgres application_name=gs_rewind connect_timeout=5  rw_timeout=600
[2024-11-27 10:04:47.130][2279278][dn_6001_6002_6003][gs_rewind]: connect to primary success
[2024-11-27 10:04:47.132][2279278][dn_6001_6002_6003][gs_rewind]: get pg_control success
[2024-11-27 10:04:47.132][2279278][dn_6001_6002_6003][gs_rewind]: target server was interrupted in mode 5.
[2024-11-27 10:04:47.132][2279278][dn_6001_6002_6003][gs_rewind]: sanityChecks success
[2024-11-27 10:04:47.132][2279278][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 11/5000028 and checkpoint redo at 11/5000028 from source control file
[2024-11-27 10:04:47.132][2279278][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 11/4A8 and checkpoint redo at 11/428 from target control file
[2024-11-27 10:04:47.132][2279278][dn_6001_6002_6003][gs_rewind]: find max lsn fail, errmsg:failed to translate name to xlog: 

 
gs_rewind receive FATAL, it will exit
[2024-11-27 10:04:47.132][2279278][dn_6001_6002_6003][pw_ctl]: inc build failed.
[2024-11-27 10:04:47.132][2279278][dn_6001_6002_6003][pw_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 10:04:47.142][2279278][dn_6001_6002_6003][pw_ctl]: build try host(xxx) port(15401) success
[2024-11-27 10:04:47.143][2279278][dn_6001_6002_6003][gs_rewind]: set gaussdb state file when incremental build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2024-11-27 10:04:47.176][2279278][dn_6001_6002_6003][gs_rewind]: connected to server: host=xxx port=15401 dbname=postgres application_name=gs_rewind connect_timeout=5  rw_timeout=600
[2024-11-27 10:04:47.179][2279278][dn_6001_6002_6003][gs_rewind]: connect to primary success
[2024-11-27 10:04:47.179][2279278][dn_6001_6002_6003][gs_rewind]: get pg_control success
[2024-11-27 10:04:47.179][2279278][dn_6001_6002_6003][gs_rewind]: target server was interrupted in mode 5.
[2024-11-27 10:04:47.179][2279278][dn_6001_6002_6003][gs_rewind]: sanityChecks success
[2024-11-27 10:04:47.179][2279278][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 11/5000028 and checkpoint redo at 11/5000028 from source control file
[2024-11-27 10:04:47.179][2279278][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 11/4A8 and checkpoint redo at 11/428 from target control file
[2024-11-27 10:04:47.179][2279278][dn_6001_6002_6003][gs_rewind]: find max lsn fail, errmsg:failed to translate name to xlog: 

 
gs_rewind receive FATAL, it will exit
[2024-11-27 10:04:47.180][2279278][dn_6001_6002_6003][pw_ctl]: inc build failed.
[2024-11-27 10:04:47.180][2279278][dn_6001_6002_6003][pw_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 10:04:47.186][2279278][dn_6001_6002_6003][pw_ctl]: build try host(xxx) port(15401) success
[2024-11-27 10:04:47.186][2279278][dn_6001_6002_6003][gs_rewind]: set gaussdb state file when incremental build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2024-11-27 10:04:47.220][2279278][dn_6001_6002_6003][gs_rewind]: connected to server: host=xxx port=15401 dbname=postgres application_name=gs_rewind connect_timeout=5  rw_timeout=600
[2024-11-27 10:04:47.222][2279278][dn_6001_6002_6003][gs_rewind]: connect to primary success
[2024-11-27 10:04:47.223][2279278][dn_6001_6002_6003][gs_rewind]: get pg_control success
[2024-11-27 10:04:47.223][2279278][dn_6001_6002_6003][gs_rewind]: target server was interrupted in mode 5.
[2024-11-27 10:04:47.223][2279278][dn_6001_6002_6003][gs_rewind]: sanityChecks success
[2024-11-27 10:04:47.223][2279278][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 11/5000028 and checkpoint redo at 11/5000028 from source control file
[2024-11-27 10:04:47.223][2279278][dn_6001_6002_6003][gs_rewind]: find last checkpoint at 11/4A8 and checkpoint redo at 11/428 from target control file
[2024-11-27 10:04:47.223][2279278][dn_6001_6002_6003][gs_rewind]: find max lsn fail, errmsg:failed to translate name to xlog: 

 
gs_rewind receive FATAL, it will exit
[2024-11-27 10:04:47.223][2279278][dn_6001_6002_6003][pw_ctl]: inc build failed.
[2024-11-27 10:04:47.223][2279278][dn_6001_6002_6003][pw_ctl]: inc build failed, change to full build.
[2024-11-27 10:04:47.223][2279278][dn_6001_6002_6003][pw_ctl]: current workdir is (/home/omm).
[2024-11-27 10:04:47.224][2279278][dn_6001_6002_6003][pw_ctl]: set gaussdb state file when auto build build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(FULL_BUILD).
[2024-11-27 10:04:47.225][2279278][dn_6001_6002_6003][gs_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 10:04:47.230][2279278][dn_6001_6002_6003][gs_ctl]: build try host(xxx) port(15401) success
[2024-11-27 10:04:47.230][2279278][dn_6001_6002_6003][gs_ctl]: connected to server success, build started.
[2024-11-27 10:04:47.272][2279278][dn_6001_6002_6003][gs_ctl]: clear old target dir success
[2024-11-27 10:04:47.272][2279278][dn_6001_6002_6003][gs_ctl]: create build tag file success
[2024-11-27 10:04:47.272][2279278][dn_6001_6002_6003][gs_ctl]: create build tag file again success
[2024-11-27 10:04:47.272][2279278][dn_6001_6002_6003][gs_ctl]: get system identifier success
[2024-11-27 10:04:47.273][2279278][dn_6001_6002_6003][gs_ctl]: receiving and unpacking files...
[2024-11-27 10:04:47.273][2279278][dn_6001_6002_6003][gs_ctl]: create backup label success
[2024-11-27 10:04:47.397][2279278][dn_6001_6002_6003][gs_ctl]: xlog start point: 11/5000028
[2024-11-27 10:04:47.397][2279278][dn_6001_6002_6003][gs_ctl]: begin build tablespace list
[2024-11-27 10:04:47.397][2279278][dn_6001_6002_6003][gs_ctl]: finish build tablespace list
[2024-11-27 10:04:47.397][2279278][dn_6001_6002_6003][gs_ctl]: begin get xlog by xlogstream
[2024-11-27 10:04:47.397][2279278][dn_6001_6002_6003][gs_ctl]: starting background WAL receiver
[2024-11-27 10:04:47.397][2279278][dn_6001_6002_6003][gs_ctl]: starting walreceiver
[2024-11-27 10:04:47.397][2279278][dn_6001_6002_6003][gs_ctl]: begin receive tar files
[2024-11-27 10:04:47.397][2279278][dn_6001_6002_6003][gs_ctl]: receiving and unpacking files...
[2024-11-27 10:04:47.407][2279278][dn_6001_6002_6003][gs_ctl]: build try host(xxx) port(15401) success
[2024-11-27 10:04:47.407][2279278][dn_6001_6002_6003][gs_ctl]: check identify system success
[2024-11-27 10:04:47.408][2279278][dn_6001_6002_6003][gs_ctl]: send START_REPLICATION 11/5000000 success
[2024-11-27 10:04:49.616][2279278][dn_6001_6002_6003][gs_ctl]: finish receive tar files
[2024-11-27 10:04:49.616][2279278][dn_6001_6002_6003][gs_ctl]: xlog end point: 11/6000058
[2024-11-27 10:04:49.616][2279278][dn_6001_6002_6003][gs_ctl]: waiting for background process to finish streaming...
[2024-11-27 10:04:52.445][2279278][dn_6001_6002_6003][gs_ctl]: starting fsync all files come from source.
[2024-11-27 10:04:53.052][2279278][dn_6001_6002_6003][gs_ctl]: finish fsync all files.
[2024-11-27 10:04:53.053][2279278][dn_6001_6002_6003][gs_ctl]: build dummy dw file success
[2024-11-27 10:04:53.053][2279278][dn_6001_6002_6003][gs_ctl]: rename build status file success
[2024-11-27 10:04:53.072][2279278][dn_6001_6002_6003][gs_ctl]: auto build build completed(/database/panweidb/data).
[2024-11-27 10:04:53.110][2279278][dn_6001_6002_6003][gs_ctl]: waiting for server to start...
.0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

0 LOG:  [Alarm Module]Host Name: panweidb02 

0 LOG:  [Alarm Module]Host IP: panweidb02. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>

0 LOG:  [Alarm Module]Cluster Name: panweidb_Cluster 

0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
 0 [BACKEND] LOG:  the config file /database/panweidb/data/postgresql.conf verify start.
2024-11-27 10:04:53.188 67467e45.10000 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  the config file /database/panweidb/data/postgresql.conf verify success.
2024-11-27 10:04:53.189 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 01000  0 [BACKEND] WARNING:  Error happen when loading license, error code: 2, error message: cannot write data to dir /etc/panweidb/license

2024-11-27 10:04:53.189 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  base_page_saved_interval is 400, ori is 400.
2024-11-27 10:04:53.190 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 DB010  0 [REDO] LOG:  Recovery parallelism, cpu count = 2, max = 4, actual = 2
2024-11-27 10:04:53.190 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 DB010  0 [REDO] LOG:  ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
2024-11-27 10:04:53.194 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

2024-11-27 10:04:53.194 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Host Name: panweidb02 

2024-11-27 10:04:53.194 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Host IP: panweidb02. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>

2024-11-27 10:04:53.194 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Cluster Name: panweidb_Cluster 

2024-11-27 10:04:53.194 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

2024-11-27 10:04:53.195 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  loaded library "security_plugin"
2024-11-27 10:04:53.196 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 01000  0 [BACKEND] WARNING:  could not create any HA TCP/IP sockets
2024-11-27 10:04:53.197 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2024-11-27 10:04:53.198 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  reserved memory for backend threads is: 340 MB
2024-11-27 10:04:53.198 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  reserved memory for WAL buffers is: 320 MB
2024-11-27 10:04:53.198 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  Set max backend reserve memory is: 660 MB, max dynamic memory is: 9931 MB
2024-11-27 10:04:53.198 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  shared memory 1184 Mbytes, memory context 10591 Mbytes, max process memory 12288 Mbytes
2024-11-27 10:04:53.235 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [CACHE] LOG:  set data cache  size(402653184)
2024-11-27 10:04:53.575 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [SEGMENT_PAGE] LOG:  Segment-page constants: DF_MAP_SIZE: 8156, DF_MAP_BIT_CNT: 65248, DF_MAP_GROUP_EXTENTS: 4175872, IPBLOCK_SIZE: 8168, EXTENTS_PER_IPBLOCK: 1021, IPBLOCK_GROUP_SIZE: 4090, BMT_HEADER_LEVEL0_TOTAL_PAGES: 8323072, BktMapEntryNumberPerBlock: 2038, BktMapBlockNumber: 25, BktBitMaxMapCnt: 512
2024-11-27 10:04:53.629 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  panweidb: fsync file "/database/panweidb/data/gaussdb.state.temp" success
2024-11-27 10:04:53.630 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  create gaussdb state file success: db state(STARTING_STATE), server mode(Standby), connection index(1)
2024-11-27 10:04:53.663 67467e45.1 [unknown] 22974537073216 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  max_safe_fds = 974, usable_fds = 1000, already_open = 16
.
[2024-11-27 10:04:55.120][2279278][dn_6001_6002_6003][gs_ctl]:  done
[2024-11-27 10:04:55.120][2279278][dn_6001_6002_6003][gs_ctl]: server started (/database/panweidb/data)
[2024-11-27 10:04:55.120][2279278][dn_6001_6002_6003][gs_ctl]: fopen build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 10:04:55.120][2279278][dn_6001_6002_6003][gs_ctl]: fprintf build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 10:04:55.121][2279278][dn_6001_6002_6003][gs_ctl]: fsync build pid file "/database/panweidb/data/gs_build.pid" success
[omm@panweidb03 panweidb]$ pw_ctl build -D $PGDATA
0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

0 LOG:  [Alarm Module]Host Name: panweidb03 

0 LOG:  [Alarm Module]Host IP: panweidb03. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>

0 LOG:  [Alarm Module]Cluster Name: panweidb_Cluster 

0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
 0 [BACKEND] LOG:  the config file /database/panweidb/data/postgresql.conf verify start.
2024-11-27 10:07:49.874 67467ef5.10000 [unknown] 23056320307776 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  the config file /database/panweidb/data/postgresql.conf verify success.
[2024-11-27 10:07:49.880][2272773][][pw_ctl]: gs_ctl incremental build ,datadir is /database/panweidb/data
[2024-11-27 10:07:49.881][2272773][][pw_ctl]: fopen build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 10:07:49.881][2272773][][pw_ctl]: fprintf build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 10:07:49.881][2272773][][pw_ctl]: fsync build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 10:07:49.881][2272773][][pw_ctl]: stop failed, killing panweidb by force ...
[2024-11-27 10:07:49.881][2272773][][pw_ctl]: command [ps c -eo pid,euid,cmd | grep panweidb | grep -v grep | awk '{if($2 == curuid && $1!="-n") print "/proc/"$1"/cwd"}' curuid=`id -u`| xargs ls -l | awk '{if ($NF=="/database/panweidb/data")  print $(NF-2)}' | awk -F/ '{print $3 }' | xargs kill -9 >/dev/null 2>&1 ] path: [/database/panweidb/data] 
[2024-11-27 10:07:49.893][2272773][][pw_ctl]: server stopped
[2024-11-27 10:07:49.893][2272773][dn_6001_6002_6003][pw_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 10:07:49.900][2272773][dn_6001_6002_6003][pw_ctl]: build try host(xxx) port(15401) success
[2024-11-27 10:07:49.901][2272773][dn_6001_6002_6003][gs_rewind]: set gaussdb state file when incremental build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2024-11-27 10:07:49.936][2272773][dn_6001_6002_6003][gs_rewind]: connected to server: host=xxx port=15401 dbname=postgres application_name=gs_rewind connect_timeout=5  rw_timeout=600
[2024-11-27 10:07:49.939][2272773][dn_6001_6002_6003][gs_rewind]: connect to primary success
[2024-11-27 10:07:49.939][2272773][dn_6001_6002_6003][gs_rewind]: could not open file "/database/panweidb/data/global/pg_control" for reading: No such file or directory
 
gs_rewind receive FATAL, it will exit
[2024-11-27 10:07:49.939][2272773][dn_6001_6002_6003][pw_ctl]: inc build failed.
[2024-11-27 10:07:49.939][2272773][dn_6001_6002_6003][pw_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 10:07:49.946][2272773][dn_6001_6002_6003][pw_ctl]: build try host(xxx) port(15401) success
[2024-11-27 10:07:49.947][2272773][dn_6001_6002_6003][gs_rewind]: set gaussdb state file when incremental build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2024-11-27 10:07:49.981][2272773][dn_6001_6002_6003][gs_rewind]: connected to server: host=xxx port=15401 dbname=postgres application_name=gs_rewind connect_timeout=5  rw_timeout=600
[2024-11-27 10:07:49.984][2272773][dn_6001_6002_6003][gs_rewind]: connect to primary success
[2024-11-27 10:07:49.984][2272773][dn_6001_6002_6003][gs_rewind]: could not open file "/database/panweidb/data/global/pg_control" for reading: No such file or directory
 
gs_rewind receive FATAL, it will exit
[2024-11-27 10:07:49.984][2272773][dn_6001_6002_6003][pw_ctl]: inc build failed.
[2024-11-27 10:07:49.985][2272773][dn_6001_6002_6003][pw_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 10:07:49.992][2272773][dn_6001_6002_6003][pw_ctl]: build try host(xxx) port(15401) success
[2024-11-27 10:07:49.992][2272773][dn_6001_6002_6003][gs_rewind]: set gaussdb state file when incremental build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2024-11-27 10:07:50.027][2272773][dn_6001_6002_6003][gs_rewind]: connected to server: host=xxx port=15401 dbname=postgres application_name=gs_rewind connect_timeout=5  rw_timeout=600
[2024-11-27 10:07:50.029][2272773][dn_6001_6002_6003][gs_rewind]: connect to primary success
[2024-11-27 10:07:50.029][2272773][dn_6001_6002_6003][gs_rewind]: could not open file "/database/panweidb/data/global/pg_control" for reading: No such file or directory
 
gs_rewind receive FATAL, it will exit
[2024-11-27 10:07:50.029][2272773][dn_6001_6002_6003][pw_ctl]: inc build failed.
[2024-11-27 10:07:50.029][2272773][dn_6001_6002_6003][pw_ctl]: inc build failed, change to full build.
[2024-11-27 10:07:50.029][2272773][dn_6001_6002_6003][pw_ctl]: current workdir is (/database/panweidb).
[2024-11-27 10:07:50.030][2272773][dn_6001_6002_6003][pw_ctl]: set gaussdb state file when auto build build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(FULL_BUILD).
[2024-11-27 10:07:50.031][2272773][dn_6001_6002_6003][gs_ctl]: Get repl_auth_mode is  and repl_uuid is 
[2024-11-27 10:07:50.037][2272773][dn_6001_6002_6003][gs_ctl]: build try host(xxx) port(15401) success
[2024-11-27 10:07:50.037][2272773][dn_6001_6002_6003][gs_ctl]: connected to server success, build started.
[2024-11-27 10:07:50.038][2272773][dn_6001_6002_6003][gs_ctl]: clear old target dir success
[2024-11-27 10:07:50.038][2272773][dn_6001_6002_6003][gs_ctl]: create build tag file success
[2024-11-27 10:07:50.038][2272773][dn_6001_6002_6003][gs_ctl]: create build tag file again success
[2024-11-27 10:07:50.038][2272773][dn_6001_6002_6003][gs_ctl]: get system identifier success
[2024-11-27 10:07:50.038][2272773][dn_6001_6002_6003][gs_ctl]: receiving and unpacking files...
[2024-11-27 10:07:50.038][2272773][dn_6001_6002_6003][gs_ctl]: create backup label success
[2024-11-27 10:07:50.168][2272773][dn_6001_6002_6003][gs_ctl]: xlog start point: 11/8000268
[2024-11-27 10:07:50.168][2272773][dn_6001_6002_6003][gs_ctl]: begin build tablespace list
[2024-11-27 10:07:50.168][2272773][dn_6001_6002_6003][gs_ctl]: finish build tablespace list
[2024-11-27 10:07:50.168][2272773][dn_6001_6002_6003][gs_ctl]: begin get xlog by xlogstream
[2024-11-27 10:07:50.168][2272773][dn_6001_6002_6003][gs_ctl]: starting background WAL receiver
[2024-11-27 10:07:50.168][2272773][dn_6001_6002_6003][gs_ctl]: starting walreceiver
[2024-11-27 10:07:50.168][2272773][dn_6001_6002_6003][gs_ctl]: begin receive tar files
[2024-11-27 10:07:50.168][2272773][dn_6001_6002_6003][gs_ctl]: receiving and unpacking files...
[2024-11-27 10:07:50.179][2272773][dn_6001_6002_6003][gs_ctl]: build try host(xxx) port(15401) success
[2024-11-27 10:07:50.179][2272773][dn_6001_6002_6003][gs_ctl]: check identify system success
[2024-11-27 10:07:50.179][2272773][dn_6001_6002_6003][gs_ctl]: send START_REPLICATION 11/8000000 success
[2024-11-27 10:07:51.961][2272773][dn_6001_6002_6003][gs_ctl]: finish receive tar files
[2024-11-27 10:07:51.961][2272773][dn_6001_6002_6003][gs_ctl]: xlog end point: 11/9000058
[2024-11-27 10:07:51.961][2272773][dn_6001_6002_6003][gs_ctl]: waiting for background process to finish streaming...
[2024-11-27 10:07:55.233][2272773][dn_6001_6002_6003][gs_ctl]: starting fsync all files come from source.
[2024-11-27 10:07:55.852][2272773][dn_6001_6002_6003][gs_ctl]: finish fsync all files.
[2024-11-27 10:07:55.852][2272773][dn_6001_6002_6003][gs_ctl]: build dummy dw file success
[2024-11-27 10:07:55.852][2272773][dn_6001_6002_6003][gs_ctl]: rename build status file success
[2024-11-27 10:07:55.871][2272773][dn_6001_6002_6003][gs_ctl]: auto build build completed(/database/panweidb/data).
[2024-11-27 10:07:55.910][2272773][dn_6001_6002_6003][gs_ctl]: waiting for server to start...
.0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

0 LOG:  [Alarm Module]Host Name: panweidb03 

0 LOG:  [Alarm Module]Host IP: panweidb03. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>

0 LOG:  [Alarm Module]Cluster Name: panweidb_Cluster 

0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
 0 [BACKEND] LOG:  the config file /database/panweidb/data/postgresql.conf verify start.
2024-11-27 10:07:55.991 67467efb.10000 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  the config file /database/panweidb/data/postgresql.conf verify success.
2024-11-27 10:07:55.992 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 01000  0 [BACKEND] WARNING:  Error happen when loading license, error code: 2, error message: cannot write data to dir /etc/panweidb/license

2024-11-27 10:07:55.992 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  base_page_saved_interval is 400, ori is 400.
2024-11-27 10:07:55.992 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 DB010  0 [REDO] LOG:  Recovery parallelism, cpu count = 2, max = 4, actual = 2
2024-11-27 10:07:55.992 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 DB010  0 [REDO] LOG:  ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
2024-11-27 10:07:55.997 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

2024-11-27 10:07:55.997 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Host Name: panweidb03 

2024-11-27 10:07:55.997 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Host IP: panweidb03. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>

2024-11-27 10:07:55.997 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Cluster Name: panweidb_Cluster 

2024-11-27 10:07:55.997 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58

2024-11-27 10:07:55.998 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  loaded library "security_plugin"
2024-11-27 10:07:55.999 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 01000  0 [BACKEND] WARNING:  could not create any HA TCP/IP sockets
2024-11-27 10:07:56.000 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2024-11-27 10:07:56.000 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  reserved memory for backend threads is: 340 MB
2024-11-27 10:07:56.000 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  reserved memory for WAL buffers is: 320 MB
2024-11-27 10:07:56.000 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  Set max backend reserve memory is: 660 MB, max dynamic memory is: 9931 MB
2024-11-27 10:07:56.000 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  shared memory 1184 Mbytes, memory context 10591 Mbytes, max process memory 12288 Mbytes
2024-11-27 10:07:56.033 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [CACHE] LOG:  set data cache  size(402653184)
2024-11-27 10:07:56.345 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [SEGMENT_PAGE] LOG:  Segment-page constants: DF_MAP_SIZE: 8156, DF_MAP_BIT_CNT: 65248, DF_MAP_GROUP_EXTENTS: 4175872, IPBLOCK_SIZE: 8168, EXTENTS_PER_IPBLOCK: 1021, IPBLOCK_GROUP_SIZE: 4090, BMT_HEADER_LEVEL0_TOTAL_PAGES: 8323072, BktMapEntryNumberPerBlock: 2038, BktMapBlockNumber: 25, BktBitMaxMapCnt: 512
2024-11-27 10:07:56.406 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  panweidb: fsync file "/database/panweidb/data/gaussdb.state.temp" success
2024-11-27 10:07:56.406 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  create gaussdb state file success: db state(STARTING_STATE), server mode(Standby), connection index(1)
2024-11-27 10:07:56.433 67467efb.1 [unknown] 22832148980288 [unknown] 0 dn_6001_6002_6003 00000  0 [BACKEND] LOG:  max_safe_fds = 974, usable_fds = 1000, already_open = 16
.
[2024-11-27 10:07:57.921][2272773][dn_6001_6002_6003][gs_ctl]:  done
[2024-11-27 10:07:57.921][2272773][dn_6001_6002_6003][gs_ctl]: server started (/database/panweidb/data)
[2024-11-27 10:07:57.921][2272773][dn_6001_6002_6003][gs_ctl]: fopen build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 10:07:57.921][2272773][dn_6001_6002_6003][gs_ctl]: fprintf build pid file "/database/panweidb/data/gs_build.pid" success
[2024-11-27 10:07:57.922][2272773][dn_6001_6002_6003][gs_ctl]: fsync build pid file "/database/panweidb/data/gs_build.pid" success
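
The build output above is long, but only a few lines decide whether the standby is healthy: whether the incremental build fell back to a full build, which xlog range was streamed, and whether the server came back up. Below is a minimal sketch for pulling those lines out of a saved log. It assumes the output of `pw_ctl build -D $PGDATA` was redirected to a file. The function name `summarize_build_log` is hypothetical, and the patterns are just the message texts seen in the log above.

```shell
#!/bin/bash
# Hypothetical helper: summarise a saved `pw_ctl build` log.
# It only searches for message texts that appear in the build output above.
summarize_build_log() {
    local log="$1"

    if grep -q 'inc build failed, change to full build' "$log"; then
        echo "build mode: full (incremental build failed)"
    else
        echo "build mode: no fallback to full build seen"
    fi

    # Print the xlog start/end points reported during the build, if any.
    grep -oE 'xlog (start|end) point: [0-9A-Fa-f]+/[0-9A-Fa-f]+' "$log"

    if grep -q 'server started' "$log"; then
        echo "server: started"
        return 0
    fi
    echo "server: start not confirmed"
    return 1
}
```

One way to use it would be `pw_ctl build -D $PGDATA > build.log 2>&1` followed by `summarize_build_log build.log` on each standby.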

4.4.6 Start the cluster

cm_ctl start
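
After `cm_ctl start`, the cluster may take a while to settle, so it is worth repeating the `gs_om -t status --detail` check from section 2.1 until `cluster_state` looks healthy. The sketch below polls for that. It assumes `Normal` is the healthy value (the earlier output only showed `Degraded`). The function names and the `STATUS_CMD` override are hypothetical and exist only so the loop can be tried without a live cluster.

```shell
#!/bin/bash
# Read `gs_om -t status --detail` output on stdin and print the cluster_state value.
get_cluster_state() {
    awk -F':' '/^cluster_state/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Poll until cluster_state is "Normal" (assumed healthy value) or attempts run out.
# Arguments: number of attempts (default 30), seconds between attempts (default 5).
# STATUS_CMD is a hypothetical override for the status command.
wait_for_normal() {
    local attempts="${1:-30}" interval="${2:-5}" state i
    for ((i = 1; i <= attempts; i++)); do
        state=$(${STATUS_CMD:-gs_om -t status --detail} | get_cluster_state)
        echo "attempt $i: cluster_state=${state:-unknown}"
        [ "$state" = "Normal" ] && return 0
        sleep "$interval"
    done
    return 1
}
```

Calling `wait_for_normal 60 5` right after `cm_ctl start` would wait up to roughly five minutes before giving up.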

4.4.7 Export the data with pw_dumpall

pw_dumpall -f backup_all.sql -p 15400
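
Because the next step re-initializes the instance, it is worth confirming that the dump is usable before moving on. The sketch below is a rough sanity check, not a validation of the dump's contents. It assumes `pw_dumpall` writes plain-text SQL in which table definitions start with `CREATE TABLE` at the beginning of a line, as `pg_dumpall` does. The function name `check_dump` is hypothetical.

```shell
#!/bin/bash
# Hypothetical pre-check: refuse to continue if the dump looks unusable.
check_dump() {
    local dump="$1" tables

    # -s is true only if the file exists and is larger than zero bytes.
    if [ ! -s "$dump" ]; then
        echo "dump file $dump is missing or empty" >&2
        return 1
    fi

    # Count CREATE TABLE statements as a rough sign that the schema was exported.
    tables=$(grep -c '^CREATE TABLE' "$dump")
    echo "size: $(wc -c < "$dump") bytes, CREATE TABLE statements: $tables"
    [ "$tables" -gt 0 ]
}
```

For example, `check_dump backup_all.sql || echo "do not re-initialize yet"`. In this test the `test_data` table from section 2.2 should be among the statements counted.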

4.4.8 Initialize a new instance with pw_initdb

pw_initdb  -D /home/omm/data/panweidb --nodename panweidb -w xxx --dbcompatibility=A
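
PostgreSQL's `initdb` refuses to run against a non-empty data directory. Assuming `pw_initdb` behaves the same way (an assumption, not something the output above confirms), a quick check of the target directory beforehand avoids a failed run. The function name `dir_ready_for_initdb` is hypothetical.

```shell
#!/bin/bash
# Hypothetical pre-check: succeed if the path does not exist yet,
# or if it is an existing directory with no entries in it.
dir_ready_for_initdb() {
    local dir="$1"
    [ ! -e "$dir" ] && return 0      # will be created by the init tool
    [ -d "$dir" ] || return 1        # exists but is not a directory
    [ -z "$(ls -A "$dir")" ]         # directory must contain nothing
}
```

For example, `dir_ready_for_initdb /home/omm/data/panweidb || echo "target directory is not empty"`.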

4.4.9 Restore the data after the database starts

psql -d postgres -p 15400 -f /home/omm/backup_all.sql > restore.log
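
`psql` writes error messages to stderr, so with a plain `> restore.log` redirect they appear on the terminal rather than in the log. The sketch below assumes the restore was instead run with `> restore.log 2>&1` so that both streams land in the file, and then counts the lines that report an error. The function name `count_restore_errors` is hypothetical, and matching on `ERROR:` / `FATAL:` is a heuristic.

```shell
#!/bin/bash
# Hypothetical helper: print how many lines in a restore log report ERROR or FATAL.
# Returns 0 only when that count is zero.
count_restore_errors() {
    local log="$1" n
    n=$(grep -cE '(ERROR|FATAL):' "$log")
    echo "$n"
    [ "$n" -eq 0 ]
}
```

For example, `count_restore_errors restore.log` prints the count. A non-zero result means the log should be read before the restored data is trusted.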
Last modified: 2024-11-30 23:17:05