MogDB: An Upgrade Test Full of Twists and Turns (Part 1)

Original · 由迪 · 2022-03-25

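For work reasons I recently started learning MogDB. I had spent some time reviewing its documentation earlier, so I still had a 2.0.1 environment around. The latest MogDB release is 2.1.0, so I decided to upgrade, and this document simply records the process. The operating system is CentOS 7.6 (CentOS 8 was not yet supported when I installed), and the original installation was a minimal single-node install. I had no idea what problems the upgrade would run into.

Installation guide: https://docs.mogdb.io/zh/mogdb/v2.0.1/2-installation-on-a-single-node

Upgrade guide: https://docs.mogdb.io/zh/mogdb/v2.1/upgrade-guide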

Pre-upgrade preparation and checks

Pre-upgrade preparation checklist

Table 3 Pre-upgrade preparation checklist

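| No. | Preparation item | What to prepare | Suggested start | Time required (days/hours/minutes) |
| --- | --- | --- | --- | --- |
| 1 | Collect node information | Collect environment details for the nodes involved: node names, IP addresses, and the passwords of the root and omm users. | One day before the upgrade | 1 hour |
| 2 | Set up remote login for root | Edit the configuration file to allow the root user to log in remotely. | One day before the upgrade | 2 hours |
| 3 | Back up data | Follow the "Backup and Restore" chapter of the Administrator Guide. | One day before the upgrade | Varies with data volume and backup strategy |
| 4 | Obtain and verify the upgrade package | Obtain the upgrade package and verify its integrity. | One day before the upgrade | 0.5 hours |
| 5 | Health check | Use the gs_checkos tool to check the operating system status. | One day before the upgrade | 0.5 hours |
| 6 | Check disk usage on the database nodes | Use the df command to check disk usage. | One day before the upgrade | 0.5 hours |
| 7 | Check database status | Use the gs_om tool to check the database status. | One day before the upgrade | 0.5 hours |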

I worked through the steps above one by one.
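Item 6 of the checklist only says to look at the df output. As a rough sketch, that check can be scripted so that nearly full filesystems stand out. The 80% threshold and the function name check_disk_usage are illustrative assumptions, not values taken from the MogDB documentation.

```shell
# check_disk_usage: reads `df -P` output on stdin and prints every mount
# point whose usage is at or above the given threshold (default 80, an
# assumed value; pick whatever limit suits your environment).
check_disk_usage() {
  local threshold=${1:-80}
  awk -v limit="$threshold" '
    NR > 1 {                      # skip the header line
      use = $5
      sub(/%/, "", use)           # "90%" -> "90"
      if (use + 0 >= limit) printf "%s is %s%% full\n", $6, use
    }
  '
}
```

It can be fed live data with df -P | check_disk_usage 80. Mount points containing spaces are not handled by this simple field split.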

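First, back up the database following the documentation: https://docs.mogdb.io/zh/mogdb/v2.0.1/1-2-br

Following the syntax given in the documentation, I first tried the tool's help option: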

gs_basebackup -? | --help

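Strange syntax; I had never seen anything like it.

Let's run it:

[omm@mogdb-1 ~]$ gs_basebackup -? | --help
bash: --help: command not found...

I was not sure what the documentation was trying to say. Most likely the | is synopsis notation separating two spellings of the same option (-? or --help); typed literally, bash treats it as a pipe and tries to run --help as a command.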

Now the version:

gs_basebackup -V | --version

The syntax is just as odd. Running it:

[omm@mogdb-1 ~]$ gs_basebackup -V | --version
bash: --version: command not found...

It seems the documentation still has room for improvement.
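If the | in those documentation lines is read as "either option", each line stands for two separate commands. The small sketch below expands such a synopsis line into the commands one would actually type; expand_synopsis is a hypothetical helper written for this note and it only handles the simple "tool optA | optB" shape.

```shell
# expand_synopsis: turn a synopsis such as "tool -a | --long" into one
# runnable command line per alternative.
expand_synopsis() {
  local tool=${1%% *}      # first word: the program name
  local options=${1#* }    # the rest: alternatives separated by "|"
  printf '%s\n' "$options" | tr '|' '\n' |
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
    while IFS= read -r opt; do
      printf '%s %s\n' "$tool" "$opt"
    done
}
```

Called with the quoted string 'gs_basebackup -V | --version', it prints one line for -V and one for --version, either of which can be run on its own.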

mkdir /home/omm/backup

gs_basebackup -D /home/omm/backup -h 127.0.0.1 -p 26000 -Fplain -Xstream

gs_basebackup: could not connect to server: could not connect to server: Operation now in progress
Is the server running on host "127.0.0.1" and accepting TCP/IP connections on port 26001?

It could not connect.
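A quick probe can show which ports actually accept TCP connections; note that the message names port 26001 although -p 26000 was passed on the command line. The sketch below relies on bash's /dev/tcp pseudo-device and the coreutils timeout command. port_open is a hypothetical helper name, not a MogDB tool, and the 2-second limit is an arbitrary choice.

```shell
# port_open HOST PORT: succeeds only if a TCP connection to HOST:PORT can
# be opened within 2 seconds. Uses bash's built-in /dev/tcp redirection,
# so no extra packages (nc, telnet) are required.
port_open() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# report_port HOST PORT: print a one-line verdict for humans.
report_port() {
  if port_open "$1" "$2"; then
    echo "$1:$2 accepts connections"
  else
    echo "$1:$2 is not reachable"
  fi
}
```

Running report_port 127.0.0.1 26000 and report_port 127.0.0.1 26001 side by side would show whether either port is listening on the loopback address.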

I gave up on it without hesitation.

On to another tool, gs_probackup.

[omm@mogdb-1 mogdb]$ gs_om -t status --detail
[ Cluster State ]

cluster_state : Normal
redistributing : No
current_az : AZ_ALL

[ Datanode State ]

node node_ip instance state

1 mogdb-1 192.168.17.130 6001 /mogdb/data/db1 P Primary Normal

Following the steps in the documentation:

[omm@mogdb-1 mogdb]$ gs_probackup init -B /home/omm/backup
INFO: Backup catalog '/home/omm/backup' successfully inited

[omm@mogdb-1 mogdb]$ gs_probackup add-instance -B/home/omm/backup -D /mogdb/data/db1/ --instance=6001
INFO: Instance '6001' successfully inited

[omm@mogdb-1 ~]$ gs_probackup set-config -B /home/omm/backup --instance=6001

[omm@mogdb-1 ~]$ gs_probackup backup -B /home/omm/backup -b full --instance=6001
INFO: Backup start, gs_probackup version: 2.4.2, instance: 6001, backup ID: R957ZL, backup mode: FULL, wal mode: STREAM, remote: false, compress-algorithm: none, compress-level: 1
LOG: Backup destination is initialized
ERROR: could not connect to database omm: connect to server failed: No such file or directory

WARNING: Backup R957ZL is running, setting its status to ERROR

It defaulted to a database named omm, the same name as the OS user. That looked like a bug to me.
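One possible explanation, offered as an assumption rather than a confirmed diagnosis: PostgreSQL-family client libraries conventionally fall back to the connecting user's name when no database is given, and gs_probackup may follow the same rule. The sketch below mimics that libpq-style lookup order (PGDATABASE, then PGUSER, then the OS user). default_dbname is a hypothetical helper, not a MogDB command.

```shell
# default_dbname: print the database name a libpq-style client would pick
# when none is passed on the command line. Assumed order (from libpq, not
# verified against gs_probackup): $PGDATABASE, else $PGUSER, else the
# current OS user.
default_dbname() {
  echo "${PGDATABASE:-${PGUSER:-$(id -un)}}"
}
```

For a shell logged in as omm with neither variable set, this prints omm, which matches the database name in the error message.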

I asked a colleague, who suggested specifying the database and the port explicitly:

[omm@mogdb-1 ~]$ gs_probackup backup -B /home/omm/backup -b full --instance=6001 -d postgres -p 26000
INFO: Backup start, gs_probackup version: 2.4.2, instance: 6001, backup ID: R9588Y, backup mode: FULL, wal mode: STREAM, remote: false, compress-algorithm: none, compress-level: 1
LOG: Backup destination is initialized
WARNING: This openGauss instance was initialized without data block checksums. gs_probackup have no way to detect data block corruption without them. Reinitialize PGDATA with option '--data-checksums'.
LOG: Database backup start
INFO: Cannot parse path "base"
LOG: started streaming WAL at 0/4000000 (timeline 1)
[2022-03-22 19:07:46]: check identify system success
[2022-03-22 19:07:46]: send START_REPLICATION 0/4000000 success
[2022-03-22 19:07:46]: keepalive message is received
[2022-03-22 19:07:46]: keepalive message is received
INFO: PGDATA size: 569MB
INFO: Start transferring data files
LOG: Creating page header map "/home/omm/backup/backups/6001/R9588Y/page_header_map"
INFO: Data files are transferred, time elapsed: 3s
[2022-03-22 19:07:49]: keepalive message is received
INFO: wait for pg_stop_backup()
[2022-03-22 19:07:52]: keepalive message is received
[2022-03-22 19:07:52]: keepalive message is received
[2022-03-22 19:07:55]: keepalive message is received
[2022-03-22 19:07:58]: keepalive message is received
[2022-03-22 19:07:58]: keepalive message is received
[2022-03-22 19:08:01]: keepalive message is received
[2022-03-22 19:08:04]: keepalive message is received
[2022-03-22 19:08:04]: keepalive message is received
[2022-03-22 19:08:07]: keepalive message is received
[2022-03-22 19:08:10]: keepalive message is received
[2022-03-22 19:08:10]: keepalive message is received
[2022-03-22 19:08:13]: keepalive message is received
[2022-03-22 19:08:16]: keepalive message is received
[2022-03-22 19:08:16]: keepalive message is received
[2022-03-22 19:08:19]: keepalive message is received
[2022-03-22 19:08:22]: keepalive message is received
[2022-03-22 19:08:22]: keepalive message is received
[2022-03-22 19:08:25]: keepalive message is received
[2022-03-22 19:08:28]: keepalive message is received
[2022-03-22 19:08:28]: keepalive message is received
[2022-03-22 19:08:31]: keepalive message is received
[2022-03-22 19:08:34]: keepalive message is received
[2022-03-22 19:08:34]: keepalive message is received
[2022-03-22 19:08:37]: keepalive message is received
[2022-03-22 19:08:40]: keepalive message is received
[2022-03-22 19:08:40]: keepalive message is received
[2022-03-22 19:08:43]: keepalive message is received
[2022-03-22 19:08:46]: keepalive message is received
[2022-03-22 19:08:46]: keepalive message is received
[2022-03-22 19:08:49]: keepalive message is received
WARNING: pg_stop_backup still waiting for all required WAL segments to be archived (60 seconds elapsed)
HINT: Check that your archive_command is executing properly. pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
CONTEXT: referenced column: lsn
[2022-03-22 19:08:52]: keepalive message is received
[2022-03-22 19:08:52]: keepalive message is received
[2022-03-22 19:08:55]: keepalive message is received
[2022-03-22 19:08:58]: keepalive message is received
[2022-03-22 19:08:58]: keepalive message is received
[2022-03-22 19:09:01]: keepalive message is received
[2022-03-22 19:09:04]: keepalive message is received
[2022-03-22 19:09:04]: keepalive message is received
[2022-03-22 19:09:07]: keepalive message is received
[2022-03-22 19:09:10]: keepalive message is received
[2022-03-22 19:09:10]: keepalive message is received
[2022-03-22 19:09:13]: keepalive message is received
[2022-03-22 19:09:16]: keepalive message is received
[2022-03-22 19:09:16]: keepalive message is received
[2022-03-22 19:09:19]: keepalive message is received
[2022-03-22 19:09:22]: keepalive message is received
[2022-03-22 19:09:22]: keepalive message is received
[2022-03-22 19:09:25]: keepalive message is received
[2022-03-22 19:09:28]: keepalive message is received
[2022-03-22 19:09:28]: keepalive message is received
[2022-03-22 19:09:31]: keepalive message is received
[2022-03-22 19:09:34]: keepalive message is received
[2022-03-22 19:09:34]: keepalive message is received
[2022-03-22 19:09:37]: keepalive message is received
[2022-03-22 19:09:40]: keepalive message is received
[2022-03-22 19:09:40]: keepalive message is received
[2022-03-22 19:09:43]: keepalive message is received
[2022-03-22 19:09:46]: keepalive message is received
[2022-03-22 19:09:46]: keepalive message is received
[2022-03-22 19:09:49]: keepalive message is received
WARNING: pg_stop_backup still waiting for all required WAL segments to be archived (120 seconds elapsed)
HINT: Check that your archive_command is executing properly. pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
CONTEXT: referenced column: lsn
[2022-03-22 19:09:52]: keepalive message is received
[2022-03-22 19:09:52]: keepalive message is received
[2022-03-22 19:09:55]: keepalive message is received
[2022-03-22 19:09:58]: keepalive message is received
[2022-03-22 19:09:58]: keepalive message is received
[2022-03-22 19:10:01]: keepalive message is received
[2022-03-22 19:10:04]: keepalive message is received
[2022-03-22 19:10:04]: keepalive message is received
[2022-03-22 19:10:07]: keepalive message is received
[2022-03-22 19:10:10]: keepalive message is received
[2022-03-22 19:10:10]: keepalive message is received
[2022-03-22 19:10:13]: keepalive message is received
[2022-03-22 19:10:16]: keepalive message is received
[2022-03-22 19:10:16]: keepalive message is received
[2022-03-22 19:10:19]: keepalive message is received
[2022-03-22 19:10:22]: keepalive message is received
[2022-03-22 19:10:22]: keepalive message is received
[2022-03-22 19:10:25]: keepalive message is received
[2022-03-22 19:10:28]: keepalive message is received
[2022-03-22 19:10:28]: keepalive message is received
[2022-03-22 19:10:31]: keepalive message is received
[2022-03-22 19:10:34]: keepalive message is received
[2022-03-22 19:10:34]: keepalive message is received
[2022-03-22 19:10:37]: keepalive message is received
[2022-03-22 19:10:40]: keepalive message is received
[2022-03-22 19:10:40]: keepalive message is received
[2022-03-22 19:10:43]: keepalive message is received
[2022-03-22 19:10:47]: keepalive message is received
[2022-03-22 19:10:47]: keepalive message is received
[2022-03-22 19:10:50]: keepalive message is received
[2022-03-22 19:10:53]: keepalive message is received
[2022-03-22 19:10:53]: keepalive message is received
[2022-03-22 19:10:56]: keepalive message is received
[2022-03-22 19:10:59]: keepalive message is received
[2022-03-22 19:10:59]: keepalive message is received
[2022-03-22 19:11:02]: keepalive message is received
[2022-03-22 19:11:05]: keepalive message is received
[2022-03-22 19:11:05]: keepalive message is received
[2022-03-22 19:11:08]: keepalive message is received
[2022-03-22 19:11:11]: keepalive message is received
[2022-03-22 19:11:11]: keepalive message is received
[2022-03-22 19:11:14]: keepalive message is received
[2022-03-22 19:11:17]: keepalive message is received
[2022-03-22 19:11:17]: keepalive message is received
[2022-03-22 19:11:20]: keepalive message is received
[2022-03-22 19:11:23]: keepalive message is received
[2022-03-22 19:11:23]: keepalive message is received
[2022-03-22 19:11:26]: keepalive message is received
[2022-03-22 19:11:29]: keepalive message is received
[2022-03-22 19:11:29]: keepalive message is received
[2022-03-22 19:11:32]: keepalive message is received
[2022-03-22 19:11:35]: keepalive message is received
[2022-03-22 19:11:35]: keepalive message is received
[2022-03-22 19:11:38]: keepalive message is received
[2022-03-22 19:11:41]: keepalive message is received
[2022-03-22 19:11:41]: keepalive message is received
[2022-03-22 19:11:44]: keepalive message is received
[2022-03-22 19:11:47]: keepalive message is received
[2022-03-22 19:11:47]: keepalive message is received
[2022-03-22 19:11:50]: keepalive message is received
WARNING: pg_stop_backup still waiting for all required WAL segments to be archived (240 seconds elapsed)
HINT: Check that your archive_command is executing properly. pg_stop_backup can be canceled safely, but the database backup will not be usable without all the WAL segments.
CONTEXT: referenced column: lsn
[2022-03-22 19:11:53]: keepalive message is received
[2022-03-22 19:11:53]: keepalive message is received
[2022-03-22 19:11:56]: keepalive message is received
[2022-03-22 19:11:59]: keepalive message is received
[2022-03-22 19:11:59]: keepalive message is received
[2022-03-22 19:12:02]: keepalive message is received
[2022-03-22 19:12:05]: keepalive message is received
[2022-03-22 19:12:05]: keepalive message is received
[2022-03-22 19:12:08]: keepalive message is received
[2022-03-22 19:12:11]: keepalive message is received
[2022-03-22 19:12:11]: keepalive message is received
[2022-03-22 19:12:14]: keepalive message is received
[2022-03-22 19:12:17]: keepalive message is received
[2022-03-22 19:12:17]: keepalive message is received
[2022-03-22 19:12:20]: keepalive message is received
[2022-03-22 19:12:23]: keepalive message is received
[2022-03-22 19:12:23]: keepalive message is received
[2022-03-22 19:12:26]: keepalive message is received
[2022-03-22 19:12:29]: keepalive message is received
[2022-03-22 19:12:29]: keepalive message is received
[2022-03-22 19:12:32]: keepalive message is received
[2022-03-22 19:12:35]: keepalive message is received
[2022-03-22 19:12:35]: keepalive message is received
[2022-03-22 19:12:38]: keepalive message is received
[2022-03-22 19:12:41]: keepalive message is received
[2022-03-22 19:12:41]: keepalive message is received
[2022-03-22 19:12:44]: keepalive message is received
[2022-03-22 19:12:47]: keepalive message is received
[2022-03-22 19:12:47]: keepalive message is received
[2022-03-22 19:12:50]: keepalive message is received
WARNING: Cancel request sent
ERROR: pg_stop_backup doesn't answer in 300 seconds, cancel it
WARNING: Backup R9588Y is running, setting its status to ERROR

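The backup still did not succeed. According to the warnings, pg_stop_backup kept waiting for the required WAL segments to be archived until it was cancelled after 300 seconds.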
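With this many keepalive lines, the few messages that matter are easy to miss. The following sketch filters a saved copy of such output down to the WARNING and ERROR lines plus a keepalive count. summarize_backup_log is a hypothetical helper and assumes the output was captured to a file or pipe first; it is not a gs_probackup feature.

```shell
# summarize_backup_log: read backup output on stdin, print only the lines
# that start with WARNING: or ERROR:, then report how many keepalive
# lines were seen.
summarize_backup_log() {
  awk '
    /keepalive message is received/ { keepalive++ }
    /^(WARNING|ERROR):/             { print }
    END { printf "keepalive lines: %d\n", keepalive + 0 }
  '
}
```

For example, if the output had been saved to a file named backup.log (an assumed name), summarize_backup_log < backup.log would surface the archive warnings and the final timeout immediately.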

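Time to fall back on the traditional approach. There are many ways to take a backup; any one that succeeds will do.

To keep the database consistent, shut it down first.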

[omm@mogdb-1 ~]$ gs_om -t stop

Stopping cluster.

Successfully stopped cluster.

End stop cluster.

Back up the data files

Starting the backup:

tar -zcvf mogdb.tar.gz /mogdb

........

/mogdb/data/db1/pg_llog/snapshots/
/mogdb/data/db1/pg_llog/mappings/
/mogdb/data/db1/pg_errorinfo/
/mogdb/data/db1/pg_location/
/mogdb/data/db1/PG_VERSION
/mogdb/data/db1/pg_ctl.lock
/mogdb/data/db1/postgresql.conf.lock
/mogdb/data/db1/postgresql.conf
/mogdb/data/db1/mot.conf
/mogdb/data/db1/pg_hba.conf
/mogdb/data/db1/pg_ident.conf
/mogdb/data/db1/postgresql.conf.bak
/mogdb/data/db1/server.crt
/mogdb/data/db1/server.key
/mogdb/data/db1/cacert.pem
/mogdb/data/db1/server.key.cipher
/mogdb/data/db1/server.key.rand
/mogdb/data/db1/pg_hba.conf.lock
/mogdb/data/db1/pg_hba.conf.bak
/mogdb/data/db1/gaussdb.state
/mogdb/data/db1/postmaster.opts
/mogdb/data/db1/gswlm_userinfo.cfg

........

Back up the software

tar czvf opt.tar.gz /opt
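The two tar commands above create archives but do not check them afterwards. As a minimal sketch, a cold backup can be followed by a listing pass and a checksum, so a damaged archive is noticed before the upgrade rather than during a restore. backup_dir and the .sha256 file naming are assumptions made for illustration.

```shell
# backup_dir SRC ARCHIVE: archive directory SRC into ARCHIVE (gzip tar),
# confirm the archive can be read back in full, and store a SHA-256
# checksum next to it. Returns non-zero if any step fails.
backup_dir() {
  local src=$1 archive=$2
  # -C keeps paths in the archive relative to the parent of SRC.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
  # Listing the archive forces tar to decompress and read every member.
  tar -tzf "$archive" > /dev/null || return 1
  sha256sum "$archive" > "$archive.sha256"
}
```

With the database stopped, backup_dir /mogdb /home/omm/mogdb.tar.gz would be one way to call it; sha256sum -c on the generated .sha256 file re-checks the archive later.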

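Right in the middle of the backup I discovered that the virtual machine's disk was damaged, and the damage happened to sit under /opt. Fine. To keep this from affecting the upgrade or causing it to fail, I created a new disk for the VM, mounted it at /opt, and reinstalled my dear MogDB.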

/opt/software/mogdb/script/gs_install -X /opt/software/mogdb/clusterconfig.xml --gsinit-parameter="--locale=en_US.UTF-8" --gsinit-parameter="--encoding=UTF-8"

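I ran the reinstall. It failed: the installer reported that MogDB was already installed.

With no other option, I had to read the source code. That is perhaps one advantage of an open-source database, although what a product really needs is tight integration and stability.

Reading the source, I found: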

vim /opt/mogdb/tools/script/local/CheckInstall.py

[Image: image-20220325143239076]

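It turns out an environment variable is used to tell whether an installation has already taken place. All right: find the variable and get rid of it.

I found it in .bashrc, emptied the file, and ran the installer again.

Another error: GPHOME could not be found.

So the file cannot simply be emptied. I copied the previous contents back.

That raises a question.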

export GAUSS_ENV=2

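The value had been 2, which triggered error GAUSS_51806. As the master Eygle once taught: when in doubt, make an educated guess.

So setting GAUSS_ENV=1 should mean "not installed yet".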
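Rather than emptying .bashrc, only the one line can be changed while a copy of the file is kept. The sketch below does that with GNU sed. set_gauss_env is a hypothetical helper, and the meaning of the values 1 and 2 remains the guess described above, not documented behaviour.

```shell
# set_gauss_env RC_FILE VALUE: keep a backup of RC_FILE as RC_FILE.bak,
# then rewrite only the "export GAUSS_ENV=..." line. Other exports such
# as GPHOME are left untouched. Requires GNU sed for -i.
set_gauss_env() {
  local rc_file=$1 value=$2
  cp -p "$rc_file" "$rc_file.bak" || return 1
  sed -i "s/^export GAUSS_ENV=.*/export GAUSS_ENV=$value/" "$rc_file"
}
```

Called as set_gauss_env ~/.bashrc 1, it leaves the original file available as ~/.bashrc.bak in case the guess turns out to be wrong.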

After changing the variable, the environment has to be reloaded:

source .bashrc

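or exit and run su - omm again.

Lazy as I am, I naturally chose the latter: six keystrokes and the job is done.

Carrying on.

I ran the installer again.

Another error: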

/opt/software/mogdb/script/gspylib/os/./…/…/…/lib/psutil/_psutil_linux.so

The permissions are wrong. Fine.

chmod -R 755 /opt/software/mogdb/script/gspylib/os/./…/…/…/lib/psutil/_psutil_linux.so
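Since the target here is a single file, the -R flag has no effect on it. A slightly more careful variant, sketched below, reads the current mode first and reports what changed. ensure_mode is a hypothetical helper and relies on GNU stat; the file to fix is passed as an argument.

```shell
# ensure_mode PATH MODE: compare the octal permission bits of PATH with
# MODE and run chmod only when they differ, printing what happened.
ensure_mode() {
  local path=$1 want=$2 have
  have=$(stat -c '%a' "$path") || return 1
  if [ "$have" != "$want" ]; then
    chmod "$want" "$path" || return 1
    echo "changed $path: $have -> $want"
  else
    echo "ok $path: $have"
  fi
}
```

The printed line records the previous mode, which helps when the same permission problem shows up on other files under the installation directory.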

I ran the installer once more.

At last the installation went through smoothly.
