Startup failure
[omm@dba data]$ gs_om -t start
Starting cluster.
=========================================
=========================================
[GAUSS-53600]: Can not start the database, the cmd is source /home/omm/.bashrc; python3 '/opt/software/opengauss/om/script/local/StartInstance.py' -U omm -R /opt/software/opengauss/install/app -t 300 --security-mode=off, Error:
[FAILURE] dba:
[GAUSS-51607] : Failed to start instance. Error: Please check the gs_ctl log for failure details.
[2023-01-29 11:58:04.941][3089][][gs_ctl]: gs_ctl started,datadir is /opt/software/opengauss/install/data
[2023-01-29 11:58:04.991][3089][][gs_ctl]: waiting for server to start...
.0 LOG: [Alarm Module]can not read GAUSS_WARNING_TYPE env.
0 LOG: [Alarm Module]Host Name: dba
0 LOG: [Alarm Module]Host IP: dba. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>
0 LOG: [Alarm Module]Cluster Name: dbCluster
0 LOG: [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 57
0 WARNING: failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING: failed to parse feature control file: gaussdb.version.
0 WARNING: Failed to load the product control file, so gaussdb cannot distinguish product version.
2023-01-29 11:58:05.080 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 DB010 0 [REDO] LOG: Recovery parallelism, cpu count = 4, max = 4, actual = 4
2023-01-29 11:58:05.080 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 DB010 0 [REDO] LOG: ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
2023-01-29 11:58:05.085 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: [Alarm Module]can not read GAUSS_WARNING_TYPE env.
2023-01-29 11:58:05.085 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: [Alarm Module]Host Name: dba
2023-01-29 11:58:05.085 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: [Alarm Module]Host IP: dba. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>
2023-01-29 11:58:05.085 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: [Alarm Module]Cluster Name: dbCluster
2023-01-29 11:58:05.085 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 57
2023-01-29 11:58:05.087 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: loaded library "security_plugin"
2023-01-29 11:58:05.088 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 42809 0 [BACKEND] LOG: could not bind IPv4 socket at the 0 time: Cannot assign requested address
2023-01-29 11:58:05.088 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 42809 0 [BACKEND] HINT: Port 15400 is used, run 'netstat -anop|grep 15400' or 'lsof -i:15400'(need root) to see who is using this port.
.2023-01-29 11:58:06.090 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 42809 0 [BACKEND] LOG: could not bind IPv4 socket at the 1 time: Cannot assign requested address
2023-01-29 11:58:06.090 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 42809 0 [BACKEND] HINT: Port 15400 is used, run 'netstat -anop|grep 15400' or 'lsof -i:15400'(need root) to see who is using this port.
.2023-01-29 11:58:07.091 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 42809 0 [BACKEND] LOG: could not bind IPv4 socket at the 2 time: Cannot assign requested address
2023-01-29 11:58:07.091 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 42809 0 [BACKEND] HINT: Port 15400 is used, run 'netstat -anop|grep 15400' or 'lsof -i:15400'(need root) to see who is using this port.
.2023-01-29 11:58:08.094 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: exec cmd: lsof -i:15400
sh: lsof: command not found
2023-01-29 11:58:08.100 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: exec cmd: netstat -anp | grep 15400
(Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.)
2023-01-29 11:58:08.128 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: <netstat>:tcp 0 0 127.0.0.1:15400 0.0.0.0:* LISTEN 3092/gaussdb
2023-01-29 11:58:08.128 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 00000 0 [BACKEND] LOG: <netstat>:tcp6 0 0 ::1:15400 :::* LISTEN 3092/gaussdb
2023-01-29 11:58:08.129 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 42809 0 [BACKEND] FATAL: could not create listen socket for "192.168.162.78:15400"
[2023-01-29 11:58:09.000][3089][][gs_ctl]: waitpid 3092 failed, exitstatus is 256, ret is 2
[2023-01-29 11:58:09.000][3089][][gs_ctl]: stopped waiting
[2023-01-29 11:58:09.000][3089][][gs_ctl]: could not start server
Examine the log output..
[omm@dba data]$
Locating the error: the FATAL line
2023-01-29 11:58:08.129 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 dn_6001 42809 0 [BACKEND] FATAL: could not create listen socket for "192.168.162.78:15400"
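When the startup output is this long, it can help to filter it mechanically rather than scan by eye. The following is a minimal sketch, assuming the log has been saved as text and that fatal messages are marked with the literal " FATAL: " tag seen above; the function name find_fatal_messages is made up for this illustration and is not part of openGauss.

```python
def find_fatal_messages(log_text):
    """Return the message part of every line that carries a ' FATAL: ' tag."""
    messages = []
    for line in log_text.splitlines():
        # partition() yields an empty separator when the tag is absent
        _, sep, message = line.partition(" FATAL: ")
        if sep:
            messages.append(message.strip())
    return messages


# Two lines copied from the failed startup above
sample_log = (
    "2023-01-29 11:58:07.091 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 "
    "dn_6001 42809 0 [BACKEND] LOG: could not bind IPv4 socket at the 2 time: "
    "Cannot assign requested address\n"
    "2023-01-29 11:58:08.129 63d5eecd.1 [unknown] 140549185963904 [unknown] 0 "
    "dn_6001 42809 0 [BACKEND] FATAL: could not create listen socket for "
    '"192.168.162.78:15400"\n'
)

for msg in find_fatal_messages(sample_log):
    print(msg)
```

The match is deliberately case-sensitive, so the "Error:" text in the GAUSS-53600 wrapper message is not picked up; only the server's own FATAL entry is.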
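Troubleshooting approach
The error message points to a problem with the IP address. Each bind attempt fails with "Cannot assign requested address", which means 192.168.162.78 is not configured on any local network interface. The "Port 15400 is used" HINT is misleading here: the only process netstat reports on port 15400 is the gaussdb instance being started (PID 3092, the same PID gs_ctl is waiting on). So first check the postgresql.conf and pg_hba.conf files, then check the operating system's network interface configuration.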
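The distinction between "address not available" and "port in use" can be reproduced outside the database. The sketch below is only an illustration using Python's standard socket module; the helper name can_bind is hypothetical. It binds to port 0 so that the operating system picks a free port, which isolates the question of whether the address itself is usable on this host.

```python
import errno
import socket


def can_bind(address, port=0):
    """Try to bind a TCP socket to (address, port).

    Returns (True, None) on success, or (False, errno_name) on failure.
    Port 0 asks the OS for any free port, so a failure then points at
    the address rather than at a port conflict.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
            return True, None
        except OSError as exc:
            return False, errno.errorcode.get(exc.errno, str(exc.errno))


print(can_bind("127.0.0.1"))
# An address that is not configured on any local interface typically
# fails with EADDRNOTAVAIL ("Cannot assign requested address") on Linux.
print(can_bind("192.168.162.78"))
```

A port conflict would instead surface as EADDRINUSE, which is the case the server's HINT text assumes.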
postgresql.conf
# - Connection Settings -
listen_addresses = 'localhost,192.168.162.78' # what IP address(es) to listen on;
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
local_bind_address = '192.168.162.78'
port = 15400 # (change requires restart)
max_connections = 5000 # (change requires restart)
# Note: Increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).
#sysadmin_reserved_connections = 3 # (change requires restart)
unix_socket_directory = '/opt/software/opengauss/tmp' # (change requires restart)
#unix_socket_group = '' # (change requires restart)
unix_socket_permissions = 0700 # begin with 0 to use octal notation
# (change requires restart)
# - Security and Authentication -
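To check these values without reading the whole file, the relevant settings can be pulled out programmatically. This is a rough sketch, not an openGauss tool: read_setting and listen_addresses are hypothetical helper names, the parser treats everything after "#" as a comment (so it would mishandle a quoted value containing "#"), and it assumes the last uncommented occurrence of a setting is the effective one.

```python
def read_setting(conf_text, name):
    """Return the unquoted value of `name` from postgresql.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing or full-line comments
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            value = val.strip().strip("'")  # assumption: last occurrence wins
    return value


def listen_addresses(conf_text):
    """Split listen_addresses into a list; the file's comment gives 'localhost' as the default."""
    raw = read_setting(conf_text, "listen_addresses") or "localhost"
    return [addr.strip() for addr in raw.split(",") if addr.strip()]


sample_conf = """\
listen_addresses = 'localhost,192.168.162.78' # what IP address(es) to listen on;
local_bind_address = '192.168.162.78'
port = 15400 # (change requires restart)
#sysadmin_reserved_connections = 3 # (change requires restart)
"""

print(listen_addresses(sample_conf))
print(read_setting(sample_conf, "local_bind_address"))
```

Both listen_addresses and local_bind_address carry the old address here, so both need to be compared against the addresses the host actually has.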
pg_hba.conf
[omm@dba data]$ tail -10 pg_hba.conf
# IPv4 local connections:
host all all 127.0.0.1/32 trust
host all all 192.168.162.78/32 sha256
# IPv6 local connections:
host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
#local replication omm trust
#host replication omm 127.0.0.1/32 trust
#host replication omm ::1/128 trust
[omm@dba data]$
[omm@dba data]$
[omm@dba data]$
Operating system network interface configuration
[omm@dba data]$ cat /etc/sysconfig/network-scripts/ifcfg-ens33
TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="dhcp"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="ens33"
UUID="6896f8ed-e7a9-408e-9401-09ae5dcaabba"
DEVICE="ens33"
ONBOOT="yes"
MTU="8192"
[omm@dba data]$
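This reveals the cause: the network interface is configured for automatic IP assignment (BOOTPROTO="dhcp"), so the host's IP address changed after the system rebooted, while the two openGauss configuration files, postgresql.conf and pg_hba.conf, were not updated and still reference the old address.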
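The same check can be scripted when several hosts need auditing. The sketch below assumes a RHEL/CentOS-style ifcfg file made of KEY="value" lines like the one shown above; parse_ifcfg and uses_dhcp are hypothetical names introduced for this example.

```python
import shlex


def parse_ifcfg(text):
    """Parse KEY="value" lines from ifcfg-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        parts = shlex.split(val)  # removes shell-style quoting
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def uses_dhcp(text):
    """True when the interface obtains its address via DHCP."""
    return parse_ifcfg(text).get("BOOTPROTO", "").lower() == "dhcp"


sample_ifcfg = """\
TYPE="Ethernet"
BOOTPROTO="dhcp"
NAME="ens33"
DEVICE="ens33"
ONBOOT="yes"
"""

if uses_dhcp(sample_ifcfg):
    print("ens33 uses DHCP: its address may change after a reboot")
```

If this reports DHCP on a database host, any IP address hard-coded in postgresql.conf or pg_hba.conf can go stale whenever the lease changes.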
Solution: update the configuration files to match the current IP
Modify postgresql.conf
# - Connection Settings -
listen_addresses = 'localhost,192.168.43.43' # what IP address(es) to listen on;
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
local_bind_address = '192.168.43.43'
port = 15400 # (change requires restart)
max_connections = 5000 # (change requires restart)
# Note: Increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction).
#sysadmin_reserved_connections = 3 # (change requires restart)
unix_socket_directory = '/opt/software/opengauss/tmp' # (change requires restart)
#unix_socket_group = '' # (change requires restart)
unix_socket_permissions = 0700 # begin with 0 to use octal notation
# (change requires restart)
Modify pg_hba.conf
[omm@dba data]$ tail -10 pg_hba.conf
# IPv4 local connections:
host all all 127.0.0.1/32 trust
host all all 192.168.43.43/32 sha256
# IPv6 local connections:
host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
#local replication omm trust
#host replication omm 127.0.0.1/32 trust
#host replication omm ::1/128 trust
[omm@dba data]$
[omm@dba data]$
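The two edits above amount to replacing one address with another in both files. A minimal sketch of that substitution follows, working on text only; replace_ip is a hypothetical helper, and actually writing the result back to postgresql.conf and pg_hba.conf (after taking a backup) is left to the reader.

```python
import re


def replace_ip(text, old_ip, new_ip):
    """Replace whole occurrences of old_ip with new_ip.

    Digit lookarounds keep a longer address such as 192.168.162.780
    from being rewritten by accident. Returns (new_text, count).
    """
    pattern = re.compile(r"(?<!\d)" + re.escape(old_ip) + r"(?!\d)")
    return pattern.subn(lambda _match: new_ip, text)


sample = (
    "listen_addresses = 'localhost,192.168.162.78'\n"
    "local_bind_address = '192.168.162.78'\n"
    "host all all 192.168.162.78/32 sha256\n"
)

updated, count = replace_ip(sample, "192.168.162.78", "192.168.43.43")
print(count)
print(updated)
```

As the comments in postgresql.conf note, changes to listen_addresses take effect only after a restart, which is why the instance is started again next.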
Successful startup
[omm@dba data]$ gs_om -t start
Starting cluster.
=========================================
[SUCCESS] dba
2023-01-29 12:45:15.831 63d5f9db.1 [unknown] 140479240804224 [unknown] 0 dn_6001 01000 0 [BACKEND] WARNING: could not create any HA TCP/IP sockets
2023-01-29 12:45:15.831 63d5f9db.1 [unknown] 140479240804224 [unknown] 0 dn_6001 01000 0 [BACKEND] WARNING: could not create any HA TCP/IP sockets
2023-01-29 12:45:15.833 63d5f9db.1 [unknown] 140479240804224 [unknown] 0 dn_6001 01000 0 [BACKEND] WARNING: Failed to initialize the memory protect for g_instance.attr.attr_storage.cstore_buffers (1024 Mbytes) or shared memory (3608 Mbytes) is larger.
=========================================
Successfully started.
[omm@dba data]$ gs_om -t status
-----------------------------------------------------------------------
cluster_name : dbCluster
cluster_state : Normal
redistributing : No
-----------------------------------------------------------------------
[omm@dba data]$ gsql -d postgres -p 15400
gsql ((openGauss 3.1.1 build 70980198) compiled at 2023-01-06 09:34:59 commit 0 last mr )
Non-SSL connection (SSL connection is recommended when requiring high-security)
Type "help" for help.
openGauss=#
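Besides gs_om -t status and gsql, a plain TCP connection attempt can confirm that the instance is now listening on the new address. This is a small sketch using only Python's standard library; port_is_open is a hypothetical helper, and the address and port in the usage lines are the ones from this article.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Values from this article; adjust for your own environment.
for host in ("127.0.0.1", "192.168.43.43"):
    print(host, port_is_open(host, 15400))
```

This only shows that something accepts connections on the port; authentication is still governed by the pg_hba.conf entries above.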